PREFACE


This  Note  describes  Rand's  latest  study  of  cost  estimating 
relationships  for  new  military  aircraft  turbine  engine  development  and 
production  programs.  The  recent  availability  of  data  for  several  new 
engine  programs  and  new,  more  powerful  statistical  tools  motivated  us  to 
develop  turbine  engine  cost  estimating  methods  that  are  simple  and  easy 
to  use,  but  accurate  enough  for  long  range  planners.  Such  methods  are 
suitable  for  conceptual  exercises,  planning  studies,  Independent  Cost 
Analyses  (ICA),  and  other  situations  for  which  conventional  detailed 
estimating  procedures  are  either  impractical  or  overly  time-consuming. 
The  estimating  relationships  developed  should  be  of  interest  to  those 
persons  throughout  the  Air  Force  and  elsewhere  in  DOD  who  are  concerned 
with  long  range  planning  and  the  preparation  or  review  of  turbine  engine 
development  and  production  program  costs. 

This  research  was  originally  undertaken  as  part  of  the  Project  AIR 
FORCE  project  "Cost  Analysis  Methods  for  Air  Force  Systems,"  which  has 
since  been  superseded  by  "Air  Force  Resource  and  Financial  Management 
Issues  for  the  1980s"  in  the  Resource  Management  Program. 
SUMMARY


This  Note  presents  equations  for  estimating  development  and 
production  costs  and  time  of  arrival  for  U.S.  military  turbojet  and 
turbofan  engines.  Interest  in  further  investigation  of  aircraft  turbine 
engine  cost  estimating  relationships  (CERs)  grew  out  of  the  availability 
of  data  for  engines  recently  developed,  and  experience  with  the  CERs  in 
Rand's  computer  model  for  estimating  Development  and  Procurement  Costs 
of Aircraft (DAPCA).

After  establishing  criteria  for  selecting  explanatory  variables  and 
CERs,  regression  analysis  was  applied  to  the  expanded  data  base  to 
develop  improved  relationships  for  the  cost  of  development  to  the  model 
qualification test (MQT), total development cost, and the cumulative
average  price  at  the  1000th  production  engine.  The  engine 
characteristics  that  best  explain  development  cost  through  MQT  and 
production  cost  are  maximum  thrust  of  the  engine  at  sea-level-static 
conditions,  an  indicator  of  engine  size;  Mach  number,  a  measure  of 
performance;  and  turbine  inlet  temperature,  the  dominant  technical 
parameter  in  the  engine  cycle.  For  total  development  cost,  which 
includes  the  expenses  involved  in  developing  a  new  engine  to  MQT,  plus 
the  cost  to  correct  service  related  deficiencies  and  costs  for  continual 
performance  and  reliability  improvements  over  time,  the  derived  equation 
includes  a  production  quantity  term  as  well  as  thrust  and  Mach  number. 

The  estimating  relationship  for  time  of  arrival  (TOA)  was  also 
refined  in  this  study.  (The  TOA  method  links  certain  engine  performance 
characteristics  with  time  to  provide  a  measure  of  an  engine's  state  of 
the art). The refined TOA model is based on 29 U.S. military turbojet
and  turbofan  engines  developed  and  produced  during  the  past  30  years. 

The  model  predicts  the  man-rated  MQT  date  as  a  function  of  certain  of 
the  engine's  performance  and  design  parameters.  The  parameters  include 
engine  thrust  to  weight  ratio,  turbine  inlet  temperature,  and  specific 
fuel  consumption,  which  are  the  three  most  important  technical 
characteristics  in  the  turbine  engine  development  process. 

Several  new  diagnostic  statistics,  generated  as  part  of  the 
analysis,  provide  more  insight  into  the  estimation  error  and  into  the 
influence  of  specific  data  points  on  the  derived  equations  than  was 
available  in  earlier  studies.  These  diagnostics  should  help  estimators 
understand the sensitivities of the CERs, and therefore the estimated
costs, to particular characteristics of individual engines in the data
base. Coupled with the expanded data base and selection criteria, these
regression diagnostics have also helped identify CERs that intuitively
satisfy our engineering sense and generally have fewer explanatory
variables, while improving their predictive capability.

These models are intended for use by long range military planners
attempting  to  determine  costs  for  new  systems--especially  those  of  a 
technically  advanced  nature--so  that  better  estimates  can  be  made.  All 
parameters  needed  are  readily  available  at  an  early  stage  of  planning 
for  a  new  system.  Care  must  be  exercised  in  using  these  models  to 
ensure  that  inputs  are  consistent  with  the  data  base  used  in  this  study. 
For  example,  cost  estimates  will  reflect  military  technology  and  the 
manner  in  which  programs  were  conducted  during  the  1950s,  1960s,  and 
1970s.  If  an  engine  is  developed  that  is  not  in  the  mainstream  trend, 
such as a variable cycle or lift engine, the estimating relationship
described  may  not  apply.  To  the  extent  that  a  new  program  differs 
from  historical  conditions,  extrapolation  will  be  necessary. 
SYMBOLS


Symbol       Definition

AIR          Airflow through the engine in lb/sec at maximum rated
             thrust.

CIP          Component Improvement Program

DEVTIME      Development time from start to MQT, months

FFER         Full Flight Envelope Release

IFR          Initial Flight Release

ISR          Initial Service Release

MACH         Maximum flight envelope Mach number (measure of speed
             related to speed of sound); 1.0 for engines designed for
             subsonic flight.

MQT          Model Qualification Test

MQTDEVCOST   Development cost to MQT, millions of FY 1980 dollars

OCR          Operational Capability Release

PFRT         Preliminary Flight Rating Test

PROCOST      Cumulative average production cost through the 1000th
             engine; thousands of FY 1980 dollars.

QMAX         Maximum dynamic pressure in flight envelope, lb/ft^2

QTY          Quantity of engines produced.

SFCMIL       Specific fuel consumption at military thrust, sea-level
             static conditions (lb/hr/lb thrust).

TEMP         Maximum turbine inlet temperature (degrees Rankine).

THRAIR       Thrust to air flow ratio (THRMAX/AIR).

THRMAX       Maximum rated thrust at sea-level static conditions,
             including afterburner thrust if any (lb).

THRWGT       Thrust to weight ratio (THRMAX/WGT).

TOA          Time of arrival at successful MQT (in calendar quarters
             since the third quarter of 1942).

TOTDEVCOST   Total cost of development (millions of FY 1980 dollars)
             including development to MQT and product improvement,
             evaluated at dates of MQTs other than the first for a
             particular engine model.

TOTPRS       Engine pressure term (psf) computed as the product of
             engine maximum pressure ratio and the maximum dynamic
             pressure of the engine design envelope.

WGT          Engine dry weight (lb).

ΔTOA         The interval between TOA and the actual date an engine
             passes its MQT (ΔTOA = calculated TOA - actual TOA date).
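Several of the symbols above are simple functions of others (THRAIR, THRWGT, TOTPRS, TOA, and ΔTOA). The following Python sketch shows one way those definitions might be computed. The class name, field names, and function names are illustrative assumptions, not part of the original study, and the sketch assumes that the third quarter of 1942 is counted as quarter zero, which the definition of TOA does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    """Hypothetical container for the basic engine characteristics."""
    thrmax: float          # THRMAX, maximum rated thrust (lb)
    air: float             # AIR, airflow at maximum rated thrust (lb/sec)
    wgt: float             # WGT, engine dry weight (lb)
    pressure_ratio: float  # engine maximum pressure ratio (dimensionless)
    qmax: float            # QMAX, maximum dynamic pressure (lb/ft^2)

    @property
    def thrair(self) -> float:
        """THRAIR = THRMAX / AIR."""
        return self.thrmax / self.air

    @property
    def thrwgt(self) -> float:
        """THRWGT = THRMAX / WGT."""
        return self.thrmax / self.wgt

    @property
    def totprs(self) -> float:
        """TOTPRS = maximum pressure ratio times QMAX (psf)."""
        return self.pressure_ratio * self.qmax


def toa_quarters(year: int, quarter: int) -> int:
    """Calendar quarters elapsed since the third quarter of 1942.

    Assumption: 1942 Q3 itself maps to 0.
    """
    return (year - 1942) * 4 + (quarter - 3)


def delta_toa(calculated_toa: float, actual_toa: float) -> float:
    """Delta TOA = calculated TOA - actual TOA, in quarters."""
    return calculated_toa - actual_toa
```

With this sign convention, a positive ΔTOA means the equation places the engine later in time than its actual MQT date.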
I. INTRODUCTION


A  modern  military  aircraft  consists  of  four  main  subsystems: 
airframe,  avionics,  propulsion  and  armament.  The  propulsion  system  of 
most  new  military  aircraft  is  a  turbine  engine.  Modern  turbine  engines 
are  complex  and  costly.  A  turbine  engine  can  cost  over  a  billion 
dollars  for  development  and  product  improvement  over  its  lifetime  and 
accounts for as much as 25 percent of the flyaway cost of a fighter
aircraft.  It  follows  that  reasonably  accurate  estimates  of  engine 
development  and  production  cost  are  needed  during  planning  for  a  new 
engine  in  an  aircraft  system. 

For  conceptual  planning  studies,  preliminary  tradeoff  analyses,  and 
the  like,  it  would  be  desirable  to  have  a  simple  and  easy  to  use 
procedure  for  estimating  the  cost  of  conceptual  or  proposed  turbine 
engines within ±25 percent. Such a procedure should allow the analyst
to obtain estimates at a time when descriptive information (e.g., engine
characteristics) is imperfect and often incomplete, and there are no
engine development program data (e.g., number of test engines and test
hours).

Consequently  Rand  has,  over  the  years,  conducted  a  number  of 
studies  of  turbine  engine  cost  estimation.  One  of  the  most  widely  known 
efforts  produced  cost  estimating  relationships  (CERs)  that  were  later 
incorporated  into  a  computer  model  of  aircraft  system  costs  generally 
known as DAPCA (Development and Procurement Costs of Aircraft).[1]

[1] J. R. Nelson and F. S. Timson, Relating Technology to
Acquisition Costs: Aircraft Turbine Engines, R-1288-PR, The Rand
Corporation, March 1974. The estimating relationships developed in this
research are incorporated in the DAPCA model: H. E. Boren, Jr., A
Computer Model for Estimating Development and Procurement Costs of
Aircraft (DAPCA III), R-1854-PR, The Rand Corporation, March 1976.

Several circumstances make it appropriate now to take another look
at  engine  development  and  production  costs.  Experience  with  the  DAPCA 
engine  equations  has  shown  them  to  be  quite  sensitive  to  small  changes 
or  errors  in  the  parameters  used  to  drive  the  estimates.  When  using 
DAPCA,  some  investigators  have  also  reported  a  tendency  toward 
underestimation  for  the  latest  high  performance  engines.  Furthermore, 
three  new  engines  have  been  developed  since  the  DAPCA  equations  were 
derived.  Adding  these  engines  to  the  analysis  offered  two  advantages: 
First, we can expect that they are more similar to the next generation
of turbine engines that will be developed, and second, increasing the
sample size provides additional confidence in the predictive capability
of the CERs. Recently, some regression diagnostics have become
available  that  greatly  increase  the  amount  of  information  generated 
during  regression  analysis.  Several  of  these  provide  useful  information 
about  collinearity  and  the  influence  of  individual  data  observations  on 
the  various  estimated  parameters.  These  statistics  seemed  well  suited 
to  evaluating  the  difficulties  some  analysts  were  encountering  with  the 
DAPCA  equations. 

The  hypothesis  of  this  study  was  that  using  recent  engine  data 
along  with  newly  developed  statistical  methods  would  make  it  possible  to 
develop  more  stable  and  accurate  CERs  that  would  be  useful  in  predicting 
costs  of  large,  modern  turbofan  and  turbojet  aircraft  engines.  Such 
equations  would  be  used  only  before  or  early  in  the  planning  and 
development  phase  of  an  engine  program,  when  no  drawings  or  bills  of 
material exist. By the time development is well along and higher
resolution estimates are needed, greater accuracy can be achieved by
extrapolating from actual costs of nameplate engines than by the
parametric methods described here.

Historically,  the  engine  program  milestones  distinguished  for 
estimating  purposes  were  as  follows: 


o  Preliminary Flight Rating Test (PFRT). A series of individual
   tests that in combination demonstrated that the engine was
   suitable for use in experimental flight testing.

o  Model Qualification Test (MQT). A series of individual tests
   that in combination demonstrated that the engine was suitable
   for production.

o  Delivery of the Nth Engine. The time when some number, say
   1000, of production engines have been delivered.


Difficulties  encountered  with  recent  aircraft  turbine  engines  have 
led  to  some  important  changes  in  development  emphasis  and  new  test 
procedures to ensure delivery of more supportable engines. The Air
Force  has  shifted  to  a  four-step  development  process.  The  proposed 
milestones  in  this  process  are: 


o  Initial Flight Release (IFR). A series of individual tests
   that in combination demonstrate the engine is suitable for
   limited flight testing.

o  Full Flight Envelope Release (FFER). A series of individual
   tests that in combination demonstrate that the engine is
   suitable for flight testing throughout the full aircraft
   performance flight envelope.

o  Initial Service Release (ISR). A series of individual tests
   that in combination demonstrate the engine is suitable for
   low rate production.

o  Operational Capability Release (OCR). A series of individual
   tests that in combination demonstrate the engine is suitable
   for full production release.


No  engines  have  yet  been  developed  under  the  new  four-step 
development  process.  This  fact  coupled  with  the  availability  of 
historical  data  in  traditional  formats  led  us  to  continue  using 
traditional  engine  development  milestones.  Nevertheless,  we  can  offer 
some  suggestions  on  making  the  transition  between  the  two  approaches. 

The  old  PFRT  milestone  would  roughly  correspond  to  IFR  or  step  one. 
The  MQT  milestone  becomes  harder  to  identify,  but  should  approximate  ISR 
or  step  three.  Step  four  historically  would  have  occurred  in  the  early 
stages of the Component Improvement Program (CIP). However, under the
PFRT/MQT  development  concept,  additional  requirements  (and  thus  costs) 
were  continually  added  as  the  military  specifications  evolved  over  the 
years.  It  can  be  considered  that  the  four-step  development  process 
merely  represents  further  evolution  and  that  any  additional  costs 
incurred  between  ISR  and  OCR  will  be  captured  within  the  accuracy  error 
of  the  model. 

This  study  derives  new  CERs  from  an  expanded  data  base  and  uses  new 
diagnostic  statistics  to  screen  the  CERs  and  to  evaluate  the 
characteristics  of  the  preferred  set.  Section  II  of  this  Note 
identifies  the  data  used,  explains  the  criteria  and  rationale  for 
selecting  explanatory  variables,  and  describes  recently  developed 
regression  diagnostics.  Section  III  presents  the  preferred  set  of  CERs. 
Comments on these results, a comparison with the DAPCA equations,
suggestions for the use of these CERs, and directions for possible
future research appear in Sec. IV. Supporting statistics for the
predictive models are available in the appendix.


II.  RESEARCH  PROCEDURE 


Before  we  review  the  data  base  and  the  analytical  procedure  used, 
it  is  appropriate  to  consider  the  objectives  of  this  analysis  and  the 
limitations  they  impose.  Engine  development  and  production  costs  are 
ordinarily  estimated  by  manufacturers  in  great  detail,  frequently  by  the 
application  of  standard  engineering  and  manufacturing  hours  to  each 
operation  and  building  up  to  a  total.  Whatever  the  degree  of  accuracy 
obtained  through  this  process,  it  is  very  time-consuming  and  requires  a 
detailed  knowledge  of  the  development  and  manufacturing  process  as  well 
as  up-to-date  information  on  standard  hours  and  material  requirements 
and  costs  for  the  various  fabrication,  assembly,  and  test  procedures. 

For  planning  studies,  preliminary  tradeoff  analysis,  and  the  like,  it  is 
desirable  to  have  a  simple  procedure  for  estimating  engine  costs  at 
important  program  milestones. 

Early  in  the  planning  cycle,  when  resources  are  limited  and 
detailed  knowledge  of  design  specification  is  unavailable,  parametric 
estimating  models  requiring  few  inputs  have  been  found  to  provide  cost 
estimates  that  are  sufficiently  accurate  for  these  purposes.  Parametric 
estimating  relationships  were  sought  in  this  study  for  four  dependent 
variables:  the  cost  of  development  up  to  successful  completion  of  MQT 
(MQTDEVCOST); total development cost (TOTDEVCOST), which includes MQT
and  all  subsequent  product  improvement  during  the  life  of  the  engine; 
1000th  unit  cumulative  average  production  cost  (PROCOST);  and  time  of 
arrival  (TOA)  at  MQT.  The  last  term  attempts  to  quantify  the  level  of 
technology  for  a  given  engine  development  program.  Unlike  some  earlier 
Rand studies, time of arrival is not used as an input to the cost
estimating equations derived here. Rather, it is provided to give the
estimator  an  indication  of  the  degree  of  risk  associated  with  an  engine 
program. 

TECHNICAL  DATA 

Technical  data  came  from  several  sources.  Most  were  originally 
provided  to  Rand  for  studies  conducted  earlier,  but  other  data  were 
provided  directly  from  the  various  military  offices  and  manufacturers 
involved  in  the  development  and  production  of  individual  engines.  Table 
1  presents  the  data  used  to  develop  the  MQTDEVCOST,  PROCOST,  and  TOA 
relationships.[1] Three engines have been added to the data base:
the F100, F101, and F404.

The engine characteristics used are those of a particular model,
depending on the context. Aircraft engines evolve through a succession
of different versions. Normally, several versions are in production at
the same time and on the same production line. When development costs
are examined through MQT, the engine characteristics used are those of
the first production model. When production costs are examined,
however, the choice is less clear-cut: during the course of an engine
production program, new models are often introduced with performance
and technical attributes that differ substantially from the original
version. For these cases, a subjective selection of the most
representative series engine was made.

Table  2  gives  the  technical  and  production  quantity  data  used  in 
the  TOTDEVCOST  model.  Data  for  the  engine  series  that  passed  the  first 
MQT for each model are listed first. Subsequent entries for the same
engine denote a data value corresponding to one of a series of MQTs
in the continuing development of that engine.

[1] In order to make the report suitable for general distribution,
proprietary cost and classified technical data are not shown.


Tables  1  and  2  reflect  the  consideration  in  this  study  of  turbojet 
and  turbofan  engines  only.  Turboprop  and  turboshaft  engines  included  in 
some  earlier  Rand  analyses  have  been  excluded.  For  future  Air  Force 
systems,  these  engines  are  expected  to  be  of  less  relevance  than 
turbofan  or  turbojet  engines. 

COST  DATA 

With  data  collected  from  diverse  sources,  comparability  of  costs 
becomes a serious problem. After ensuring that all definitional
differences were eliminated, three major adjustments were made--one for
price-level changes, one for quantity, and one for program
peculiarities--so that comparisons could be made on the basis of
constant  dollars,  at  a  specified  production  quantity,  and  for  generally 
similar  acquisition  strategies. 

Price-Level  Adjustment 

The  term  "cost"  as  used  in  this  Note  refers  to  the  total  price  to 
the  government  expressed  in  Fiscal  Year  (FY)  1980  dollars  of  an  engine 
development  or  production  program.  All  costs  were  adjusted  to  this 
price  level  with  the  index  shown  in  Table  3.  This  index,  made  up  of 
several  weighted  labor  and  material  sub-indices,  was  developed 
specifically  for  aircraft  turbine  engines  by  the  Air  Force  Systems 
Command Cost Analysis Improvement Group (AFSC/CAIG) in 1981.[2] Use of
this index to adjust total engine cost implies that the development
process is similar to the production process, which it is not. The
development of a specific RDT&E index was beyond the scope of this
effort, however. Consequently, we have used a single index as a primary
indicator of price movement.

[2] A. Fatkin, AFSC/CAIG Research Report NR1, Generic Inflation
Indexes for Weapon Systems, Hq, AFSC/ACCE, July 1981.
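As an illustration of the price-level adjustment, the Python sketch below converts a then-year cost to FY 1980 dollars. It assumes that the index values in Table 3 act as multipliers (the 1980 value is 1.00 and earlier years are larger), which is an interpretation rather than something the text states outright. Only a handful of Table 3 rows are transcribed, and the function and variable names are hypothetical.

```python
# A few multipliers transcribed from Table 3 (index = 1.00 in 1980).
# Assumption: cost in FY 1980 dollars = then-year cost * index for that year.
AFSC_CAIG_INDEX = {
    1950: 5.13,
    1960: 3.45,
    1970: 2.52,
    1975: 1.69,
    1980: 1.00,
}


def to_fy1980_dollars(then_year_cost: float, year: int) -> float:
    """Express a then-year cost in FY 1980 dollars using the engine index."""
    if year not in AFSC_CAIG_INDEX:
        raise KeyError(f"no index value transcribed for {year}")
    return then_year_cost * AFSC_CAIG_INDEX[year]
```

Under this reading, a cost of 10 million dollars incurred in 1970 would be carried in the data base as about 25.2 million FY 1980 dollars.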

                    Table 3

      AFSC/CAIG AIRCRAFT ENGINE COST INDEX

      Year    Index          Year    Index

      1946    6.44           1964    3.25
      1947    6.16           1965    3.17
      1948    5.60           1966    3.06
      1949    5.42           1967    2.95
      1950    5.13           1968    2.82
      1951    4.44           1969    2.69
      1952    4.19           1970    2.52
      1953    4.15           1971    2.39
      1954    4.06           1972    2.29
      1955    3.88           1973    2.14
      1956    3.60           1974    1.97
      1957    3.50           1975    1.69
      1958    3.53           1976    1.56
      1959    3.49           1977    1.42
      1960    3.45           1978    1.31
      1961    3.40           1979    1.17
      1962    3.37           1980    1.00
      1963    3.31

Adjustment for Quantity

Some of the engines in the data sample were produced in thousands
and others in hundreds. To compare these engines, it was necessary to
establish production cost baselines for all engines at a single
production quantity (1000 units). For those engines that were not
produced in large quantities (or whose production runs had not yet
reached 1000 units, as with the F404), costs were extrapolated along
the established slope or, in the case of the F101 engine, estimated
values were obtained from the USAF.
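The extrapolation to 1000 units can be sketched as follows. The text does not give the form of the "established slope"; this Python example assumes a conventional log-linear cumulative average cost-quantity curve, in which cumulative average cost falls to a fixed fraction (the slope) each time quantity doubles. The function name, the 90 percent slope, and the sample figures are illustrative assumptions only.

```python
import math


def cum_avg_cost_at(target_qty: float, known_qty: float,
                    known_cum_avg: float, slope: float) -> float:
    """Extrapolate cumulative average cost along a log-linear curve.

    slope is the fraction of cost retained per doubling of quantity
    (for example 0.90 for a 90 percent curve). This curve form is an
    assumption, not something specified in the text.
    """
    exponent = math.log(slope) / math.log(2.0)
    return known_cum_avg * (target_qty / known_qty) ** exponent


# Hypothetical program: cumulative average cost of 100 (thousands of
# FY 1980 dollars) after 500 engines, extrapolated to the 1000th engine.
procost_1000 = cum_avg_cost_at(1000, 500, 100.0, 0.90)
```

Because 1000 units is exactly one doubling of 500, the extrapolated value here is simply 90 percent of the known cumulative average.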

Program  Adjustment 

Although  each  engine  program  is  unique,  the  F100  and  F404  had  such 
unusual  characteristics  that  we  adjusted  these  costs  to  make  them 
comparable to other programs.

The  F100  engine  development  program,  for  example,  began  as  a  joint 
effort  with  both  the  USAF  and  Navy  developing  a  common  engine  core  that 
was  planned  to  evolve  into  the  F100  for  use  in  the  F-15  and  the  F401  for 
use  in  the  F-14B.  Total  development  dollars  spent  by  both  services  were 
more  than  that  needed  for  a  single  engine,  but  less  than  that  required 
to develop two engines. Thus the development cost values used for the
F100 engine do not include those dollars spent on the Navy version.

An  adjustment  was  also  made  to  the  F404  development  cost.  The  F404 
engine,  powerplant  for  the  Navy's  F-18,  had  its  genesis  as  the  YJ101,  a 
powerplant  funded  by  the  Air  Force  and  used  in  the  YF-17.  Just  using 
Navy  F404  development  dollars  ignores  the  fact  that  considerable  Air 
Force  resources  were  spent  on  the  YJ101  that  eventually  benefited  the 
F404.  In  this  case,  official  Navy  F404  development  costs  were  adjusted 
upward.  This  adjustment  was  made  based  on  discussions  with  Air  Force, 
contractor,  and  Navy  personnel  as  to  how  much  of  the  YJ101  development 
cost  applied  to  the  F404. 


ANALYTICAL  TECHNIQUES 

This  study  used  ordinary  least  squares  regression  as  the  basic  tool 
for developing estimating relationships.[3] Before applying statistical
methods,  we  established  criteria  for  selecting  the  explanatory  variables 
and  for  CER  selection.  The  subsections  that  follow  discuss  these 
criteria  and  the  statistical  methods  that  were  applied  to  the  data 
collected. 

Selection  of  Explanatory  Variables 

Two  criteria  were  established  before  a  variable  was  tested  for 
significance: 

1.  The  variable  had  to  be  logically  related  to  cost. 

2.  The  variable  had  to  be  known  with  a  fair  degree  of  accuracy 
during  the  concept  formulation  phase. 

The  search  for  suitable  explanatory  variables  began  with  the 
hypothesis  that  the  cost  of  an  engine  is  a  function  of  (1)  the  size  of 
the  engine,  (2)  the  level  of  technology/performance  incorporated  into 
the  engine,  and  (3)  the  time  during  which  the  engine  is  developed  and 
produced.

Inasmuch  as  engine  size  affects  the  amount  of  raw  materials  that  go 
into  the  engine,  the  size  of  the  test  facilities,  and  the  size  of  the 
machines  used  in  manufacturing,  it  appears  reasonable  to  expect  large 
engines  to  cost  more  than  small  ones.  Technical  complexity  or 
difficulty is also a cost driver. The predominant cost of an aircraft
engine development program is incurred in achieving acceptable engine
reliability and durability levels. The primary method for achieving the
desired levels is full scale testing, which becomes increasingly
expensive and complex as the technology becomes more sophisticated. In
the production phase, high technology and performance levels translate
into exotic materials and sophisticated manufacturing techniques, both
of which drive up production costs.

[3] See N. Draper and H. Smith, Appl ed., John Wiley and Sons, New
York, 1981.

A  variable  that  captures  the  time  trend  was  thought  to  be  important 
because engine complexity and the required number of test hours--hence,
cost--have  increased  and  are  continuing  to  increase  over  time.  Thus,  we 
hypothesized,  an  ideal  CER  should  have  at  least  three  explanatory 
variables:  one  that  is  indicative  of  engine  size,  one  measuring  the 
level  of  technology/performance,  and  one  reflecting  the  time  frame 
during  which  the  engine  is  developed  or  produced. 

This  hypothesis  and  the  two  criteria  described  above  limit  our 
choice  of  independent  variables  to  those  shown  in  Table  4.  Statistical 
techniques  described  in  the  next  section  were  applied  to  these 
variables.

Of  course,  engine  characteristics  alone  cannot  explain  variability 
in  program  costs.  Schedule,  production  rates,  commonality  between 
engine  types,  management,  funding,  state-of-the-art  advance, 
availability  of  labor,  and  investment  in  capital  tools  all  affect  costs 
but  cannot  be  captured  in  a  simple  model.  A  parametric  cost  model  based 
on  data  from  a  wide  assortment  of  programs  is  not  sensitive  to  small 
changes,  and  it  assumes  that  every  program  will  have  its  fair  share  of 
technical, programming, and funding problems. Only when an explanatory
variable demonstrates a consistent and perceptible influence on a
variety of programs can it be included in a cost model.

                            Table 4

Size         Performance/Technology           Time

Thrust(a)    Turbine inlet temperature(a)     Time of arrival(a)
Weight       Thrust to weight ratio(a)
Airflow      Mach number(a)
             Total pressure
             Specific fuel consumption
             Thrust per pound of airflow

(a) These variables may be easier to obtain in a long range planning
study than the others.

Limiting  the  Number  of  Explanatory  Variables 

It  is  a  generally  accepted  maxim  among  analysts  and  statisticians 
that  the  fewer  independent  variables  a  model  has,  the  better  it  will 
stand  the  test  of  time.  There  are  practical  and  theoretical  reasons  for 
this  view.  The  most  important  are: 

1.  Models with too many variables usually result in large
prediction variances because many parameters have to be
estimated.

2.  Multicollinearity is more likely to occur with a large number
of variables.

When  comparing  competing  models,  therefore,  we  favored  those  that 
had  the  fewest  explanatory  variables  while  maintaining  predictive 
quality. 

Criteria  for  CER  Selection 

The  development  of  potentially  useful  CERs  requires  the  selection 
of  the  "best"  equation.  Our  interpretation  of  "best"  is  predictive 
capability  rather  than  statistical  quality.  Realizing  predictive 
capability  requires  more  than  fitting  a  line  to  a  collection  of  data 
points;  the  selected  independent  variables  should  be  key  measures  of 
underlying  trends  and  not  be  overly  influenced  by  one  or  a  few  data 
points. For example, in several cases some independent variables
slightly improved the statistical properties of a few equations.
However, they were eliminated from consideration because they failed to
meet the above criteria.

CER  selection  criteria  also  required  the  signs  of  the  coefficients 
in  the  equations  to  be  consistent  with  intuitive  notions  of  what 
constitutes  more  technologically  advanced  achievement  with  time: 
positive  coefficients  on  variables  that  are  more  difficult  and  hence 
more  costly  to  achieve,  and  negative  coefficients  on  variables  for  which 
smaller  values  are  more  difficult  to  achieve.  For  example,  one  would 
expect  rising  turbine  temperature  to  increase  costs,  and  this  variable 
does  have  a  positive  coefficient  in  the  derived  equation.  In  all  cases 
the  equations  satisfied  these  criteria. 


Statistical  Analysis 

In  this  Note  we  consider  a  CER  as  a  regression  equation  that 
represents  a  relationship  of  the  form 

    $Y = X\beta + \varepsilon$,

where Y is a vector of size n, X is an n by p matrix of explanatory
variables, $\beta$ is a vector of p regression coefficients (including an
intercept), and $\varepsilon$ is an error vector of size n. Predicted or
estimated values are indicated by a superscript *, as in $\beta^*$ to
denote an estimate of the $\beta$ vector of true regression coefficients.
Lower case letters are used for individual matrix elements, such as
element $y_i$ of vector Y.
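To make the notation concrete, the following Python sketch estimates the coefficient vector of such a relationship by ordinary least squares, solving the normal equations (X'X)b = X'Y with plain Gaussian elimination. It is a minimal illustration using only the standard library, not the software used in the study. The function names and the sample data are hypothetical, and no transformation of the variables is assumed.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Fit y = X b + e, where X is each row of `rows` preceded by a 1.

    Returns (coefficients with the intercept first, residuals).
    """
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)
    residuals = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    return beta, residuals


# Hypothetical data: two explanatory variables, five observations.
sample_rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
sample_y = [3.0, 7.0, 6.0, 11.0, 9.0]
sample_beta, sample_resid = ols(sample_rows, sample_y)
```

Solving the normal equations directly is adequate for a small, well-conditioned illustration; with strongly collinear variables a more stable factorization would normally be preferred.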

After  identifying  those  engine  characteristics  expected  to  drive 
engine  costs,  we  computed  all  possible  regressions  and  screened  them  for 
various  numbers  of  select  independent  variables.  For  each  dependent 
variable,  the  regression  models  with  the  smallest  estimating  errors  were 
considered  candidates  for  further  evaluation.  Just  accepting  the  one 
model  with  the  smallest  error  was  not  appropriate,  because  some  models 
with  small  errors  were  invalid  for  other  reasons.  The  number  of  models 
actually  evaluated  was  not  the  same  for  all  cases.  We  always  selected  a 
set  of  models  large  enough  to  be  reasonably  likely  to  include  the 
overall  best  model  (along  with  at  least  a  few  other  potentially  useful 
models).  The  models  dropped  from  the  analysis  at  this  point  were  of 
necessity inferior to those that remained. The Mallows' C(p)
statistic,[4] a measure of the total squared error (bias plus random),
and the multiple correlation coefficient were used to help cull out
those combinations of characteristics that showed potential for use as
independent variables.

[4] See C. Mallows, "Some Comments on C(p)," Technometrics, 15
(1973), pp. 661-675.

The  next  step  was  to  compute  full  sets  of  statistics  for  the 
candidate  models.  The  F  statistic  for  the  model  as  a  whole  and  a 
t-statistic  for  each  estimated  coefficient  were  computed  to  test  for 
statistical  significance  at  the  10  percent  level.  Models  that  did  not 
test  out  as  statistically  significant  or  displayed  nonrandom  residual 
patterns  were  dropped  from  the  analysis. 

Three  statistical  criteria  were  used  to  evaluate  the  remaining 
CERs:  the  total  mean  square  error  of  the  estimate,  the  influence  of 
individual data points or sets of points, and collinearity. Estimation
error  is  usually  measured  by  the  variance  of  the  estimate,  because  we 
normally  deal  with  unbiased  estimators.  It  is  possible,  however,  for  a 
biased  estimator  to  have  a  lower  total  mean  square  error  than  an 
unbiased  estimator  if  its  variance  is  low  enough.  Additionally, 
sometimes  one  or  a  few  individual  data  observations  drive  the  values  of 
the  estimated  parameters  in  a  CER  more  strongly  than  does  the  rest  of 
the  sample.  Indeed,  even  the  set  of  variables  that  are  statistically 
significant  may  be  largely  determined  by  only  one  or  a  few  observations. 
Such  results  are  obtained  when  the  data  sample  is  not  homogeneous,  and 
the  CER  then  represents  effects  that  are  not  typical  of  the  sample  as  a 
whole,  but  rather  are  associated  with  a  peculiar  subset  of  observations. 
In  some  cases  this  subset  can  provide  useful  information.  In  others, 
the  peculiar  observations  are  bad  data  that  should  be  dropped  from  the 
analysis  so  that  a  useful  CER  can  be  developed  from  the  remaining  data. 
Finally, collinearity can increase the variances of estimated
coefficients,  producing  large  prediction  intervals  or  masking  the 
validity  of  a  test  of  statistical  significance. 

A  number  of  diagnostic  statistics  have  been  developed  to  evaluate 
these  potential  problems.  A  thorough  discussion  of  several  of  them  is 
presented  in  a  recent  book  by  Belsley,  Kuh,  and  Welsch.[5]  Most  of  the 
diagnostics  used  in  this  evaluation  are  taken  from  that  source.  Some 
additional  helpful  material  was  found  in  a  paper  by  R.  R.  Hocking. [6] 

As  mentioned  above,  a  useful  estimate  of  a  standardized  measure  of 
total  mean  square  error  is  the  C(p)  statistic: 

$$ C(p) = \frac{\text{Residual Sum of Squares}}{(\text{Standard Error of Estimate})^{2}} + 2p - n $$

When  a  large  number  of  alternative  models  are  being  considered,  C(p) 
provides  an  easily  computed  criterion  for  comparing  equations.  Several 
statistics  that  can  help  identify  influential  observations  are  listed  in 
Table  5.  Formulas  for  computing  most  of  them  are  given  by  Belsley. [7] 
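The all-possible-regressions screen with C(p) can be sketched in Python as follows. This is an illustration on synthetic data, not the procedure or data used in this study. The function names, the variable names `x0` to `x3`, and the choice of taking the squared standard error of estimate from the model containing every candidate variable (the usual convention for C(p)) are assumptions made for the example.

```python
from itertools import combinations

import numpy as np


def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X (X includes the intercept)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def all_subsets_cp(candidates, y, names):
    """Fit every non-empty subset of candidate variables and return
    (C(p), p, subset names) tuples sorted by C(p).

    Assumption: the squared standard error of estimate in the denominator
    comes from the model that contains all candidate variables.
    """
    n, k = candidates.shape
    ones = np.ones((n, 1))
    s2_full = rss(np.hstack([ones, candidates]), y) / (n - k - 1)
    results = []
    for size in range(1, k + 1):
        for cols in combinations(range(k), size):
            X = np.hstack([ones, candidates[:, list(cols)]])
            p = X.shape[1]  # coefficients in this model, intercept included
            cp = rss(X, y) / s2_full + 2 * p - n
            results.append((cp, p, [names[c] for c in cols]))
    return sorted(results, key=lambda r: r[0])


# Synthetic data (hypothetical): y depends on x0 and x2 only.
rng = np.random.default_rng(0)
n = 20
candidates = rng.normal(size=(n, 4))
y = 3.0 + 2.0 * candidates[:, 0] - 1.5 * candidates[:, 2] + rng.normal(scale=0.5, size=n)
table = all_subsets_cp(candidates, y, ["x0", "x1", "x2", "x3"])
for cp, p, subset in table[:5]:
    print(f"C(p) = {cp:7.2f}   p = {p}   variables = {subset}")
```

Models with small C(p), close to p, are the candidates worth carrying forward. By construction the model with all candidates has C(p) equal to p, so it serves only as a reference point.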

$\mathrm{DFBETAS}_{ij}$ measures the influence of the ith observation on the
jth estimated regression coefficient. A large value, greater than
$2/\sqrt{n}$, indicates that the ith observation has strong influence on
the jth coefficient.

A similar measure is $\mathrm{DFFITS}_i$, which measures the influence of
the ith observation on the fit, the estimated value of the dependent
variable. Values larger than $2\sqrt{p/n}$ denote influential cases.

[5] David A. Belsley, Edwin Kuh, and Roy E. Welsch, Regression
Diagnostics: Identifying Influential Data and Sources of Collinearity,
Wiley, New York, 1980.

[6] R. R. Hocking, "The Analysis and Selection of Variables in
Linear Regression," Biometrics, 32, March 1976, pp. 1-49.

[7] Belsley (1980).

Table 5

INDICATORS OF INFLUENTIAL OBSERVATIONS

                                                        Suggested Cutoff
Indicator            Item Measured                      Value
-------------------  ---------------------------------  ----------------
DFBETAS              Change in an estimated regression  $2/\sqrt{n}$
                     coefficient caused by deletion of
                     an observation

DFFITS               Change in fitted value caused by   $2\sqrt{p/n}$
                     deletion of an observation

Cook's Distance, D   Change in estimated regression     F(p, n-p, 0.50)
                     coefficient vector caused by
                     deletion of an observation

Hat diagonal, h      Influence of a value of dependent  2p/n
                     variable on corresponding fitted
                     value

Studentized          Estimated normalized residual      2.0
Residual, RSTUDENT

COVRATIO             Sensitivity of covariance matrix   1 ± (3p/n)
                     to deletion of an observation

A related measure is Cook's distance, D, which measures the change in
the entire $\hat{\beta}$ vector because of deletion of an observation.[8]
It can be evaluated by comparing it with an appropriate F distribution.

[8] See R. Cook, "Detection of Influential Observations in Linear
Regression," Technometrics, 19 (1977), pp. 15-18.


Another useful statistic is $h_i$, the diagonal element of the hat
matrix:

$$ H = X (X^{T} X)^{-1} X^{T} $$

where $X^{T}$ is the transpose of X, and -1 indicates matrix inverse. Each
$h_i$ reflects the influence of an observed data point $y_i$ on the fitted
value $\hat{y}_i$. A recommended cutoff value is 2p/n; values higher than
this indicate leverage points that may have undue influence on the fit.

The studentized residual, RSTUDENT, for observation i is
standardized using the hat matrix element $h_i$ and the estimated standard
error for the case with i excluded, s(i):

$$ \mathrm{RSTUDENT}_i = \frac{e_i}{s(i)\sqrt{1 - h_i}} $$

where $e_i$ is the straightforward residual $y_i - \hat{y}_i$. Typically,
one might select a magnitude of 2.0 for RSTUDENT as a screening
criterion. Observations with larger values would then be considered
outliers.

Another  way  to  note  the  effect  of  a  single  observation  is  to 
compare  covariance  matrices  of  the  estimated  coefficients  with  the 
observation  and  without  it.  The  parameter  COVRATIO  is  defined  as  the 
ratio  of  the  determinant  of  the  covariance  matrix  with  the  observation 
deleted  to  the  determinant  of  the  full  covariance  matrix.  This  ratio 
can  be  shown  to  be 


$$ \mathrm{COVRATIO}_i = \frac{1}{\left[ \dfrac{n-p-1}{n-p} + \dfrac{\mathrm{RSTUDENT}_i^{2}}{n-p} \right]^{p} (1 - h_i)} $$
COVRATIO is useful because it measures changes in the regression
coefficient  variances,  which  can  be  large  even  when  neither  high 
leverage  nor  large  residuals  exist  alone.  The  matrix  can  be  considered 
insensitive  to  observations  for  which  COVRATIO  takes  on  values  within 
3p/n  of  1.0.  A  value  greater  than  1  +  (3p/n)  indicates  that  dropping 
the  observation  increases  the  mean  square  error;  values  less  than 
1  -  (3p/n)  identify  observations  that  would  decrease  mean  square  error 
if  they  were  dropped. 

These  are,  of  course,  not  all  of  the  measures  that  can  provide 
useful  information  about  influential  observations.  They  do,  however, 
address  a  variety  of  concerns  and  have  been  effective  in  revealing  new 
information  about  the  CERs. 
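The influence measures discussed above can all be computed from a single least squares fit by using closed-form row-deletion identities. The Python sketch below is illustrative only: the function and variable names are assumptions, the data are synthetic, and the Cook's distance cutoff (which requires an F quantile) is left out of the flagging step.

```python
import numpy as np


def influence_diagnostics(X, y):
    """Row-deletion diagnostics for an OLS fit of y on X.

    X is the n-by-p data matrix (intercept column included); y has length n.
    Closed-form deletion formulas are used, so no model is refitted.
    """
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    e = y - X @ beta                                  # ordinary residuals e_i
    h = np.einsum("ij,jk,ik->i", X, xtx_inv, X)       # hat diagonal h_i
    s2 = e @ e / (n - p)                              # full-sample error variance
    s_del = np.sqrt(((n - p) * s2 - e**2 / (1 - h)) / (n - p - 1))  # s(i)
    rstudent = e / (s_del * np.sqrt(1 - h))
    dffits = rstudent * np.sqrt(h / (1 - h))
    dfbeta = ((xtx_inv @ X.T) * (e / (1 - h))).T      # row i: beta - beta(i)
    dfbetas = dfbeta / (s_del[:, None] * np.sqrt(np.diag(xtx_inv)))
    cooks_d = e**2 * h / (p * s2 * (1 - h) ** 2)
    covratio = 1.0 / (((n - p - 1) / (n - p) + rstudent**2 / (n - p)) ** p * (1 - h))
    return {"hat": h, "rstudent": rstudent, "dffits": dffits,
            "dfbetas": dfbetas, "cooks_d": cooks_d, "covratio": covratio}


def flag_observations(diag, n, p):
    """Apply the Table 5 rules of thumb (the F-based Cook's distance cutoff is omitted)."""
    return {
        "hat": np.flatnonzero(diag["hat"] > 2 * p / n),
        "rstudent": np.flatnonzero(np.abs(diag["rstudent"]) > 2.0),
        "dffits": np.flatnonzero(np.abs(diag["dffits"]) > 2 * np.sqrt(p / n)),
        "dfbetas": np.flatnonzero((np.abs(diag["dfbetas"]) > 2 / np.sqrt(n)).any(axis=1)),
        "covratio": np.flatnonzero(np.abs(diag["covratio"] - 1) > 3 * p / n),
    }


# Synthetic illustration: 16 points, the last one placed far from the rest.
rng = np.random.default_rng(42)
x = np.append(rng.uniform(0, 10, size=15), 30.0)
y = 5.0 + 2.0 * x + rng.normal(scale=1.0, size=16)
y[-1] += 15.0                                         # planted discrepancy
X = np.column_stack([np.ones(16), x])
diag = influence_diagnostics(X, y)
flags = flag_observations(diag, *X.shape)
for name, idx in flags.items():
    print(f"{name:9s} flags observations {idx.tolist()}")
```

A flagged observation is a prompt for closer inspection rather than grounds for automatic deletion; whether it is bad data or a legitimately unusual case has to be judged from knowledge of the data.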

For a regression coefficient vector $\hat{\beta}$ given by

$$ \hat{\beta} = (X^{T} X)^{-1} X^{T} Y $$

where X and Y are the independent variable data matrix and dependent
variable vector, one has associated with each coefficient $\hat{\beta}_i$
an eigenvalue $v_i$ of the $(X^{T} X)$ matrix. A condition index $c_i$ can
then be computed for each coefficient as

$$ c_i = \sqrt{v_{\max} / v_i} , \quad (1 \le i \le p) $$

where $v_{\max}$ is the largest $v_i$. Any condition index larger than
about 30 indicates collinearity that deserves further attention.[9]
Because at least two independent variables must be involved when
collinearity exists, degradation can be a problem only when a large
condition index is associated with a large proportion of the variance
of two or more coefficients. Variance proportions of 0.5 or more are
considered large for this evaluation. Thus, the condition index and a
decomposition of regression coefficient variances provide an indication
of potential collinearity problems.

[9] Belsley (1980).
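The condition index and variance decomposition check can be sketched in Python as shown below. This is a minimal illustration on synthetic data. The function names are hypothetical, and scaling each column of X to unit length before the decomposition is an assumption taken from common practice with these diagnostics; set `scale_columns=False` to work with X as given.

```python
import numpy as np


def collinearity_diagnostics(X, scale_columns=True):
    """Condition indexes and variance-decomposition proportions for X (n by p).

    Returns (cond_index, proportions); proportions[j, k] is the share of the
    variance of coefficient k associated with condition index j.
    """
    X = np.asarray(X, dtype=float)
    if scale_columns:
        X = X / np.linalg.norm(X, axis=0)      # unit column length (assumed convention)
    # Singular values d_j of X are the square roots of the eigenvalues v_j of X'X.
    _, d, vt = np.linalg.svd(X, full_matrices=False)
    cond_index = d.max() / d                   # sqrt(v_max / v_j)
    phi = (vt.T ** 2) / d**2                   # phi[k, j] = V[k, j]^2 / d_j^2
    proportions = (phi / phi.sum(axis=1, keepdims=True)).T
    return cond_index, proportions


def degrading_collinearity(cond_index, proportions, index_cutoff=30.0, share_cutoff=0.5):
    """True when some condition index above the cutoff carries a large share
    of the variance of two or more coefficients."""
    for j in np.flatnonzero(cond_index > index_cutoff):
        if np.count_nonzero(proportions[j] >= share_cutoff) >= 2:
            return True
    return False


# Synthetic illustration: x2 is almost a copy of x1.
rng = np.random.default_rng(7)
x1 = rng.normal(loc=5.0, size=30)
x2 = x1 + rng.normal(scale=1e-3, size=30)
X = np.column_stack([np.ones(30), x1, x2])
ci, props = collinearity_diagnostics(X)
print("condition indexes:", np.round(ci, 1))
print("degrading collinearity:", degrading_collinearity(ci, props))
```

The two-part test mirrors the rule stated above: a large condition index by itself is not enough, and the check fires only when that index also accounts for at least half the variance of two or more coefficients.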


III.  ESTIMATING  RELATIONSHIPS 


This section gives the results of applying our analytical procedure
to  the  engine  technical  and  cost  data  bases  discussed  in  the  previous 
section.  Equations  are  presented  for  the  (1)  cost  of  development  of  an 
engine  to  the  MQT;  (2)  total  development  cost,  which  includes  MQT  and 
all  subsequent  product  improvement  during  the  life  of  the  engine;  (3) 
cumulative  average  unit  production  cost  at  1000  units;  and  (4)  Time  of 
Arrival,  the  predicted  date  for  an  engine  having  a  specific  set  of 
characteristics  passing  the  MQT.  Each  equation  is  discussed  in  detail, 
including  definitions,  background  information,  data  base,  modifications 
to  the  data  and  results. 

DEVELOPMENT  COSTS 

Two  distinct  development  costs  are  analyzed  in  this  study.  The 
development  cost  to  MQT,  MQTDEVCOST,  is  associated  with  the  endurance 
test,  after  which  the  engine  is  considered  to  be  sufficiently  developed 
for  installation  in  a  production  military  aircraft  and  is  suitable  for 
operational  use  in  the  field.  MQTDEVCOST  includes  initial  design, 
engineering,  prototype  tooling,  materials  and  fabrication,  and  assembly 
and  testing  of  components  and  complete  engines.  Not  included  are  any 
costs  associated  with  demonstrator  programs  that  may  have  provided  the 
basic  technology  required  to  develop  the  new  engine  or  any  costs  of 
production  tooling  associated  with  the  procurement  phase.  No  flight 
test  engines  are  included. 

The  cumulative  cost  of  development  of  all  series  of  a  particular 
engine  model  through  some  number  of  production  engines  is  designated 
TOTDEVCOST, which includes the expenses involved in developing a new
engine to MQT, as outlined above, plus the costs to correct
service-related deficiencies, and costs for continued performance and
reliability improvement over time. A performance-improved model must
pass an additional endurance test at the higher performance level. The
cost  of  continued  development  beyond  MQT  for  engines  that  are  in 
production  over  several  years  can  exceed  the  cost  of  development  up  to 
MQT.  This  is  illustrated  in  Fig.  1,  which  plots  percentage  of  cost  to 
MQT  versus  percentage  of  time  to  MQT  for  eight  engine  programs. 

Development  Cost  to  MQT 

In  addition  to  deriving  a  CER  based  on  the  criteria  presented 
above,  we  present  a  discussion  of  the  diagnostic  statistics  for  the 
derived  model.  This  example  should  aid  in  interpretation  of  the 
diagnostic  statistics  for  the  other  models,  which  are  displayed  in  the 
appendix. 

Cost  Estimating  Relationship.  The  preferred  equation  and  relevant 
statistics  for  development  costs  to  MQT  are  shown  in  Table  6.  The  16 
data  points  used  to  derive  this  equation  represent  16  separate 
development programs.[1]

The  estimating  relationship  has  three  explanatory  variables: 
maximum  thrust,  Mach  number,  and  turbine  inlet  temperature.  These 
variables  have  intuitive  appeal  as  well  as  statistical  significance. 

Maximum thrust can be considered a measure of the physical size of
the engine. More than half of the cost of developing an engine is for
test hardware; and as an index of engine size, thrust reflects the
cost of hardware. Engine development programs will use as many as 50
test engines and equivalent spares.

[1] Before analysis, we dropped one engine, the TF41, from the
engine data base. This engine is an advanced version of a British
engine, the RB.168.25. Its development is better characterized as a
performance upgrade rather than a full development.

Fig. 1 -- Total development cost versus time for selected aircraft
turbine engines

SOURCE: J. R. Nelson and F. S. Timson, Relating Technology to Acquisition
Cost: Aircraft Turbine Engines, The Rand Corporation, R-1288-PR,
March 1974.

Table 6

AIRCRAFT TURBINE ENGINE DEVELOPMENT COST TO MQT
(16 turbojet and turbofan engines)

MQTDEVCOST^a = -845.804 + 0.005 (THRMAX)
                          (.074)^b
             + 249.838 (MACH) + 0.313 (TEMP)
               (.000)           (.001)

$R^2$ = .93
SE = 84.7
F = 54.6

Note: MACH has a value of 1 for engines designed
for subsonic flight.

^a Millions of FY 1980 dollars.
^b Level of significance.

The  Mach  number  can  be  considered  an  indicator  of  the  environment 
in  which  the  engine  must  operate,  and  the  operational  environment  is  a 
strong  determinant  of  the  amount  of  testing  required.  More  than  one- 
quarter  of  the  cost  of  the  development  program  is  associated  with 
testing. 

Turbine inlet temperature is the most important variable in the
analysis. One obvious explanation for the statistical significance of
turbine inlet temperature, even after the major performance parameters
have been included in the equation, is that temperature plays a dominant
role in the thermodynamics of engines; and consequently, a major
development goal has been ever-higher temperatures. These higher
temperatures have been one of the chief sources of improved performance
as measured by the major performance parameters. Although this by
itself is important, the fact that turbine temperature serves as a proxy
for many major and minor parameters, as well as for material content,
should not be overlooked. Because turbine inlet temperature is also
highly correlated with time ($R^2$ = .9), it serves as a substitute for a
time term. (An earlier Rand study shows the turbine inlet temperature
has increased at an average rate of about 35 to 40 deg. R per year.)[2]
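As a worked illustration of how the Table 6 equation is applied, the Python sketch below evaluates it for a hypothetical engine. The function name `mqt_dev_cost` and the input values are invented for the example and are not drawn from the data base. The units assumed here (THRMAX in pounds of thrust, TEMP in degrees Rankine) and the treatment of any Mach number below 1 as MACH = 1 are our reading of the table note, so they should be checked against the variable definitions before use.

```python
def mqt_dev_cost(thrmax, mach, temp):
    """Development cost to MQT in millions of FY 1980 dollars (Table 6 equation).

    Assumed units: thrmax in pounds of thrust, temp in degrees Rankine.
    """
    mach = max(mach, 1.0)  # Table 6 note: MACH is 1 for subsonic designs (our reading)
    return -845.804 + 0.005 * thrmax + 249.838 * mach + 0.313 * temp


# Hypothetical engine, for illustration only.
hypothetical = {"thrmax": 20000.0, "mach": 2.0, "temp": 2500.0}
cost = mqt_dev_cost(**hypothetical)
print(f"Estimated MQTDEVCOST: {cost:.1f} million FY 1980 dollars")

# Effect of a 100-degree increase in turbine inlet temperature, other inputs fixed.
delta = mqt_dev_cost(20000.0, 2.0, 2600.0) - cost
print(f"Added cost for +100 deg: {delta:.1f} million FY 1980 dollars")
```

Because the published coefficients are rounded (the thrust coefficient to a single significant digit), results from this sketch are approximate, and the standard error of 84.7 reported in Table 6 should be kept in mind when reading any single estimate.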

The results for this equation are displayed graphically in Fig. 2.
The 45-degree line represents the average trend or expected value for
MQTDEVCOST over the period. The points represent the calculated
(predicted) versus actual costs for the engines in the data base. The
calculated MQTDEVCOST, which is determined by inserting an engine's
characteristics into the equation, is plotted on the vertical axis. The
horizontal axis shows the actual cost through MQT. The scatter of the
residuals about the 45-degree line does not appear to violate any
assumption usually made about the distribution of errors.

Regression Diagnostics. Regression diagnostics have helped us in
many ways. First, they have flagged improperly transcribed data, as
well as inaccurate or incomplete data. Second, they have confirmed our
belief that several engines--F100, TF39, J58--are sufficiently different
from the others that their inclusion adds valuable information and
broadens the applicability of the model.

[2]  J.  R.  Nelson  et  al.,  "Future  V/STOL  Airplanes:  Guidelines  and 
Techniques  for  Acquisition  Program  Analysis  and  Evaluation,"  The  Rand 
Corporation,  N-1242-PA&E,  October  1979. 
Fig. 2 -- Development cost (MQT), preferred equation


In  the  discussions  that  follow,  regression  diagnostics  are  used  to 
identify  data  points  that  have  a  disproportionate  influence  on  the 
MQTDEVCOST  model  and  to  determine  which  elements  of  the  model  are 
influenced  most  by  these  data  points.  In  determining  cutoff  values,  we 
use  rules  of  thumb  suggested  by  Belsley,  Kuh,  and  Welsch. 

We  begin  the  discussion  of  regression  diagnostics  with  the 
condition  index.  The  maximum  condition  index  for  this  equation  is  21.7, 
which is below the established cutoff value of 30. A small condition
index indicates that degrading collinearity is absent; conversely, the
presence of a high condition index alone is a necessary but not
sufficient condition to reject a model. Most analysts also require high
variance  proportions  (say,  greater  than  0.5)  for  two  or  more  estimated 
regression  coefficient  variances.  When  both  the  condition  index  and 
variance  proportions  exceed  these  cutoff  values,  they  provide  a  measure 
of  the  degree  to  which  the  model  has  been  degraded  by  collinearity. 

Table  7  shows  studentized  residuals,  hat  diagonal,  covariance 
ratio,  DFFITS,  DFBETAS,  and  Cook's  distance  values  for  each  data  point 
used  in  the  regression.  (Statistics  that  approach  or  exceed  their 
cutoff  values  are  underlined.)  Before  beginning  the  analysis  we  expected 
the  F100,  as  well  as  the  TF39  and  J58  engines,  to  be  flagged  by  the 
diagnostic  statistics.  The  F100  should  be  an  influential  observation 
because  it  is  the  most  technically  advanced  engine  in  the  sample,  having 
the  highest  turbine  inlet  temperature  and  thrust  to  weight  ratio.  The 
TF39  should  also  be  identified  as  an  influential  observation  because, 
although  it  is  a  subsonic  engine,  it  has  the  highest  thrust  rating  of 
all  the  engines  in  the  sample.  The  large  thrust  output  is  due  to  its 
large size and the fact that much of its thrust is generated in a
mode  quite  different  from  the  other  engines  in  the  sample.  Also,  it  is 
the  only  large,  transport  type  of  engine  in  the  data  base.  The  third 
expected  outlier  is  the  J58.  This  engine  is  the  only  engine  in  the 
sample  designed  for  a  high  altitude,  high  speed  reconnaissance  mission, 
which  requires  a  considerably  different  design  and  testing  approach. 

A routine analysis of residuals identified only the TF30 as being
an outlier. The diagnostic statistics do a better job in this regard
(see Table 8). This table shows those engines identified as being
influential by the regression diagnostics: An X indicates that the
suggested cutoff value for a given statistic (column heading) is
approached or exceeded by a particular engine (row heading). For many
statistics the F100 is flagged as being an important data point in
influencing the regression coefficients. In addition, several of the
regression  diagnostics  identify  the  TF39  as  being  potentially  different 
from  other  engines  in  the  sample.  The  J58  has  the  highest  Cook's 
distance  value  and  conspicuously  exceeds  the  DFFITS  cutoff  value  and 
approaches  the  limit  established  for  the  hat  matrix  diagonal.  Two  of 
the  four  DFBETAS  cutoff  values  are  exceeded  as  well.  The  results  of 
each  diagnostic  are  discussed  individually  below. 
Studentized residual. An examination of the studentized residuals
indicates that the TF30, the first turbofan engine with an afterburner,
exceeds the cutoff value. The distribution of these residuals does not
differ greatly from the Gaussian (normal).

Hat-matrix diagonal. Using a cutoff value of .50, two engines--the
TF39 and J85--show indications of being multivariate outliers.

The  importance  of  this  fact  depends  on  the  values  of  the  other 
diagnostics  for  those  engines.  However,  we  recognize  that  they 
have  unique  attributes  and  have  decided  that  their  inclusion 
is  necessary  to  make  the  model  broadly  applicable. 

Covariance ratio. Using the formula given in Table 5, we
calculate cutoff values of 0.2 and 1.8. Engines outside this range
can be considered extreme, affecting the efficiency of coefficient
estimation. Four engines are outside this range--F404, TF33, TF39,
and J85. Of these, the TF39, which has already been identified as a
possible influential observation, has the largest deviation from the
cutoff value. The remaining three engines have values essentially
equal to the higher suggested cutoff.

DFFITS. DFFITS shows how the fitted value for a case will change
when that case is deleted from the model. This diagnostic identifies
two engines, F100 and J58, both of which have been hypothesized as
potentially influential observations.

DFBETAS. This diagnostic measures the influence of each data
point on the individual coefficients. It is clear from the
DFBETAS that the F100, TF39, and J58, along with the F101,
are very influential data points, as expected.

Cook's distance. Cook's distance provides a method for examining
the change in the estimate of $\beta$ relative to the usual
confidence measures when a single case is deleted. Again, the F100
and J58 have the largest values.


As we initially expected, the F100, TF39, and J58 proved
consistently  to  be  the  most  influential  engines  in  this  model.  Because 
future  engines  are  expected  to  be  more  like  these  than  the  others,  these 
engines  have  remained  in  our  sample. 

Unlike many CERs, this equation is linear rather than exponential
(log-linear). Our preference for a linear equation comes from higher
overall statistical qualities in the traditional measures--F-test, level
of significance of the independent variables (t-test), and coefficient
of determination ($R^2$)--as well as for the regression diagnostics and
residual plots. Indeed, in the log-linear format some independent
variables tested as not significant, and the condition index exceeded
our criteria, indicating a degree of collinearity.

Total  Development  Cost 

In  estimating  total  development  costs  the  preferred  equation,  in 
addition  to  thrust  and  Mach  number,  includes  a  production  quantity  term. 
This  variable  is  intuitively  appealing,  because  the  amount  of  resources 
devoted  to  improving  a  particular  engine  model  through  retrofitting 
should  be  related  to  the  quantity  of  engines  that  are  produced  and 
operating  in  the  field  as  well  as  its  technical  characteristics. 
Temperature  could  serve  as  an  additional  explanatory  variable  but  is  not 
included  here  to  keep  the  number  of  independent  variables  to  a  minimum. 

With this equation, the independent variables used to predict, say,
the costs through the 2000th unit, have values that reflect the
performance/technology inherent in the 2000th engine. Thus, in the
absence of any change in the performance/technology level in that
engine, the total development cost reflects efforts devoted to improving
reliability and durability, which is related to the quantity of engines
produced. However, when performance is upgraded, the equation captures
both the costs of reliability and performance improvement. The equation
and a summary of its statistical properties are shown in Table 9. Figure
3 graphically displays the results.

A rule of thumb in the industry is that development costs double
between MQT and 2000 engines, and the data substantiate this rule. Of
course, in our approach we are hoping to capture underlying trends in
the development process, with the expectation that such trends will
continue. Development costs do continue after initial qualification of
an engine. Some allowance for these, even though imprecise, is
essential in financial planning.

Table 9

TOTAL ENGINE DEVELOPMENT COST
(29 turbojet and turbofan engines)

TOTDEVCOST^a = -525.763 + 0.023 (THRMAX)
                          (.000)^b
             + 401.022 (MACH) + 0.070 (QTY)
               (.000)           (.002)

$R^2$ = .79
SE = 196.1
F = 31.0
Durbin Watson = 1.9

^a Millions of FY 80 dollars.
^b Level of significance.
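To illustrate how the Table 9 equation is used, the worked arithmetic below evaluates it for a hypothetical engine with THRMAX = 20,000, MACH = 2.0, and QTY = 2000. These input values are invented for the example, and the thrust unit (pounds) is an assumption; none of them come from the data base.

```latex
\begin{align*}
\text{TOTDEVCOST} &= -525.763 + 0.023\,(20{,}000) + 401.022\,(2.0) + 0.070\,(2000) \\
                  &= -525.763 + 460.000 + 802.044 + 140.000 \\
                  &= 876.281 \quad \text{(millions of FY 1980 dollars)}
\end{align*}
```

With the other inputs held fixed, each additional 1000 engines adds 0.070 x 1000 = 70 million dollars to the estimate, which is the allowance for continued development that the quantity term provides.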

The engines that exceed the established cutoff criteria for the
diagnostic statistics are identified in Table 10. The F100, J79, and
J57 are consistently influential engines. The latter two engines were
produced in very large quantities and have the largest values for the
QTY variable of all the engines in our sample. Far fewer F100s have
been produced to date, but the total development cost of that engine
greatly exceeds the cost of all other engines.


Fig. 3 -- Total development costs (horizontal axis: actual total
development cost, $ millions)


A  potential  problem  in  using  a  series  of  observations  for  each 
engine  to  represent  the  cumulative  costs  and  quantities  at  the  end  of 
successive  years  is  that  one  of  the  basic  assumptions  of  regression 
theory  might  be  violated--that  the  errors  in  the  successive  data  points 
are  independent  of  each  other.  The  failure  of  this  condition  is  called 
serial  (or  auto)  correlation,  and  its  effect  is  to  invalidate  the 
standard  error,  student  t,  and  F  statistics  relating  to  confidence 
measures  for  the  equation  and  its  coefficients.  (The  resulting 
coefficients  are  still  maximum  likelihood  estimates,  but  the  variances 
are  understated.  That  is,  the  standard  error,  t,  and  F  statistics  are 
not  as  good  as  indicated.)  In  the  present  case,  examination  of  the 
residuals  and  the  value  of  the  Durbin-Watson  test  statistic[3]  indicates 
that  serial  correlation  is  not  a  problem. 
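The Durbin-Watson statistic referred to above is simple to compute from the ordered residuals. The Python sketch below shows the standard formula on synthetic series; the function name and the example data are assumptions for illustration. Values near 2 are consistent with uncorrelated errors, and values well below 2 point to positive serial correlation.

```python
import numpy as np


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


# Synthetic residual series (hypothetical, not the engine data).
rng = np.random.default_rng(3)
independent = rng.normal(size=200)             # no serial correlation
drifting = np.cumsum(rng.normal(size=200))     # strong positive serial correlation
print(f"independent errors: DW = {durbin_watson(independent):.2f}")
print(f"drifting errors:    DW = {durbin_watson(drifting):.2f}")
```

When observations for several engines are pooled, as they are here, the residuals must be ordered by year within each engine for the statistic to be meaningful. Differences taken across the boundary between two engines mix unrelated errors, which this sketch does not handle.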

The  second  potential  difficulty  inherent  in  this  procedure  is  that 
individual  engines  with  more  observations  than  others  will  have  a 
stronger  influence  on  the  outcome.  The  least  squares  procedure 
minimizes  the  sum  of  the  squares  of  all  residuals  counted  equally. 
Unfortunately,  alternative  procedures  that  were  tried  reduced  the  sample 
size  to  such  a  degree  that  the  analytical  outcome  proved  meaningless. 

PRODUCTION  COSTS 

As stated previously, engine production costs used in this study
reflect the selling price to the government adjusted to FY 1980 dollars.
The production CER uses cumulative average price at the 1000th unit as a
dependent variable. These values are obtained from progress curves
derived from historical cost and quantity data.[4] This price includes
direct and indirect labor costs, material costs, tooling, technical
publications, field service, G&A expense, indirect component
improvement, contributing engineering and IR&D, and profit. Because of
the changing nature of costs in the aircraft turbine engine industry,
there is a question whether an equation based on production experience
of the 1950s, 1960s, and 1970s will be able to predict costs for the
1980s. Because the intent is to use these cost equations to predict
aircraft turbine engine experience in the 1980s and even 1990s, any
changes that are radically different from the past--such as methods of
production, plant production capacity, production rate,[5] differences
in overhead rate, and changes in contract add-ons--must be reflected in
any cost assessment. We believe, however, that these data represent the
continuation of an evolutionary process taking place over the past 30
years and that this trend is inherently captured in the existing models.

The 1000th cumulative average unit production cost[6] is used as
the dependent variable in our regression analyses and was chosen for
several reasons. First, cumulative average values are easy to convert
to total production cost. Second, it gives much better statistical
results than the first unit cost. Furthermore, production should be
stabilized by the time several hundred units have been produced.

[3] J. Durbin and G. S. Watson, "Testing for Serial Correlation in
Least Squares Regression II," Biometrika, 37, 1951, pp. 409-429, and
Vol. 38, pp. 159-178.

[4] In keeping with the DAPCA formulation of engine cost, no
estimating relationship has been developed for production learning curve
slope. DAPCA uses a default value of 0.90 for the cumulative average
curve. In the data base used in this study, all but three of the 22
engines had slopes of 0.80 or greater. A subsample of six turbofan
engines had slopes ranging upward from 0.87, with an average slope of
0.93.

[5] Because production rate is not known with precision during the
concept formulation, we could not include it among the explanatory
variables. Nevertheless, for estimating the costs of engines in
production, it has proven useful. See M. Crazur and E. McGann, "An
Investigation of Changes in Aircraft Engine Production Rate," School of
Systems and Logistics, Air Force Institute of Technology, WPAFB, Dayton,
September 1979.

[6] This cost does not include any development monies, either pre-
or post-MQT.
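The conversion from a cumulative average cost at the 1000th unit to costs at other quantities can be sketched with the usual log-linear progress curve. In the Python example below, the function names and the sample cost are hypothetical, and the curve form is an assumption: each doubling of cumulative quantity is taken to multiply the cumulative average cost by the slope, with the 0.90 cumulative average slope cited for DAPCA as the default.

```python
import math


def cum_avg_cost(q, cum_avg_at_1000, slope=0.90):
    """Cumulative average unit cost at quantity q on a log-linear progress curve.

    Assumption: each doubling of cumulative quantity multiplies the
    cumulative average cost by `slope`.
    """
    b = math.log2(slope)                      # progress-curve exponent (negative)
    return cum_avg_at_1000 * (q / 1000.0) ** b


def total_cost(q, cum_avg_at_1000, slope=0.90):
    """Total production cost through unit q: quantity times cumulative average."""
    return q * cum_avg_cost(q, cum_avg_at_1000, slope)


# Hypothetical engine with a cumulative average cost of 1500 (thousand dollars)
# at the 1000th unit.
for q in (500, 1000, 2000):
    avg = cum_avg_cost(q, 1500.0)
    print(f"q = {q:5d}  cumulative average = {avg:8.1f}  total = {total_cost(q, 1500.0):12.1f}")
```

Multiplying the cumulative average by the quantity gives total production cost directly, which is the convenience noted above as one reason for choosing the cumulative average as the dependent variable.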

Engine  production  cost  is  a  function  of  the  material  content  of  the 
engine.  Advances  in  engine  performance  are  made  possible,  in  a  large 
part,  through  lighter,  stronger,  and  more  exotic  materials.  Therefore, 
any  production  CER  should  use  independent  variables  that  reflect  an 
engine's  material  content.  The  best  equation  contains  thrust  (size), 
Mach  number  (performance/technology),  and  turbine  inlet  temperature 
(technology/time).  See  Table  11  and  Fig.  4. 

Production cost is obviously a function of engine size because
large engines require more material and labor than do smaller engines.
The thrust variable captures this effect. Mach number is also an
intuitively appealing independent variable. High Mach number engines
require advanced materials because of the high temperatures and
pressures generated throughout the engine during sustained
supersonic flight. The turbine inlet temperature is perhaps the best
single indicator of engine materials. Generally, a temperature of less
than 1900 deg. R implies a predominantly steel engine; higher
temperatures imply a high proportion of advanced materials. Because
these advanced materials are more expensive to buy and more difficult
and time consuming to machine, their use results in a more expensive
engine.[7]


Table 11

AIRCRAFT TURBINE ENGINE PRODUCTION COST
(22 turbojet and turbofan engines)


PROCOST^a = -2228.140 + 0.043 (THRMAX) + 243.250 (MACH) + 0.969 (TEMP)
                        (.000)^b         (.008)           (.000)

R^2 = .96
SE  = 183.6
F   = 135.9

a Thousands of FY 80 dollars.
b Level of significance.
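
As an illustration of how the production cost relationship in Table 11 would be applied, the sketch below evaluates the equation using the unrounded coefficients printed in Table A.5. The function name and the units are assumptions made for the example: THRMAX is taken to be maximum thrust in pounds, and TEMP to be turbine inlet temperature in degrees Rankine (the unit used in the text above).

```python
# Coefficients as printed in Table A.5 (PROCOST equation).
PROCOST_COEF = {
    "INTERCEP": -2228.140,
    "THRMAX": 0.043007,
    "MACH": 243.250,
    "TEMP": 0.968842,
}

def procost(thrmax, mach, temp):
    """Cumulative average unit production cost at the 1000th unit,
    in thousands of FY 80 dollars.

    Assumed units (not stated alongside the equation): thrmax in lb
    of thrust, temp in degrees Rankine.
    """
    c = PROCOST_COEF
    return (c["INTERCEP"]
            + c["THRMAX"] * thrmax
            + c["MACH"] * mach
            + c["TEMP"] * temp)
```

Because the form is linear-linear, each input contributes independently: for example, one additional degree of turbine inlet temperature adds about 0.97 thousand dollars regardless of thrust or Mach number.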

In  addition  to  having  a  sound  rationale,  the  equation  meets  our 
statistical  criteria.  Those  engines  that  exceed  the  cutoff  criteria  for 
the  diagnostic  statistics  are  identified  in  Table  12.  This  matrix  shows 
the  J75,  F404,  and  the  TF39  as  the  most  influential  data  points.  The 
TF39  has  been  consistently  identified  as  an  outlier.  The  model 
considerably  overpredicts  production  costs  for  the  J75  engine.  Higher 
estimated  values  result  because  its  thrust  rating  is  not  consistent  with 
other  engines  in  the  sample  having  a  similar  level  of  technology,  as 
indicated  by  turbine  inlet  temperature.  The  J75  has  one  of  the  lowest 
turbine  inlet  temperatures  and  one  of  the  highest  thrust  ratings.  This 
thrust  is  obtained  by  sheer  engine  size,  as  indicated  by  engine  weight, 
5950  lb.  In  addition,  the  J75  derives  from  the  J57  and  many  parts  are 
common  to  both  engines.  When  an  engine  has  more  than  the  usual  degree 
of  commonality,  its  costs  may  be  lower  than  would  otherwise  be  expected. 

The F404 has a fairly high turbine inlet temperature, indicating a
technically advanced engine. It is unlike the other advanced engines in
the sample, however, in that it is a small engine with a high thrust
rating. (The F404 has about the same thrust as the J79 at
considerably less weight.)

[7] T. J. Brennan et al., Cost Estimating Techniques for Advanced
Technology Engines, SAE Paper 700271, April 1970.

All  have  been  retained  in  the  sample  because  they  are  valid 
observations  that  provide  useful  information  about  the  nature  of  engine 
production  cost  and  technology. 

TIME  OF  ARRIVAL 

Technology  is  not  a  directly  measurable  quantity,  so  substitute 
measures  have  been  sought.  One  of  the  most  successful  has  been  the  time 
of  arrival  (TOA)  approach,  which  uses  multiple  regression  to  relate  the 
date  of  an  engine's  successful  completion  of  its  MQT  to  certain  of  its 
technical  characteristics.  This  study  uses  the  same  approach.  The  TOA 
method  is  provided  to  help  the  estimator  discern  the  level  of  risk 
associated  with  a  particular  engine. 

The  value  given  by  the  multiple  regression  equation  is  the  date 
when  an  engine  with  a  specified  set  of  technical  characteristics  is 
expected  to  pass  its  MQT.  TOAs  are  measured  in  quarters  of  a  year 
beginning with zero as the third quarter in 1942. The difference,
calculated TOA - actual TOA = ΔTOA, is the interval between the time
when  an  engine  is  predicted  to  pass  its  MQT  and  the  time  it  actually 
completes  it. 

The  equation  that  best  represents  the  military  trend  of 
technological  tradeoff  contains  three  variables  (see  Table  13).  This 
equation  differs  from  earlier  equations  in  that  all  the  independent 
variables  come  from  the  performance/technology  category.  Such  a 
selection is appropriate because although TOA is a measure of time, it
is a function of the performance/technology characteristics of turbine
engines.


Table 13

a In calendar quarters since the third quarter of 1942.
b Level of significance.
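
A minimal sketch of the TOA calculation follows, using the coefficients printed in Table A.7 for the TOA equation. The function names are hypothetical, and the input units are assumptions: THRWGT is taken as the thrust to weight ratio, TEMP as turbine inlet temperature in degrees Rankine, and SFCMIL as specific fuel consumption at military power. The date conversion follows the convention stated above, with quarter zero being the third quarter of 1942.

```python
import math

# Coefficients as printed in Table A.7 (TOA equation).
TOA_COEF = {
    "INTERCEP": -46.917999,
    "THRWGT": 8.727609,
    "TEMP": 0.046289,
    "SFCMIL": -31.263928,
}

def calculated_toa(thrwgt, temp, sfcmil):
    """Calculated TOA in quarters since the third quarter of 1942."""
    c = TOA_COEF
    return (c["INTERCEP"]
            + c["THRWGT"] * thrwgt
            + c["TEMP"] * temp
            + c["SFCMIL"] * sfcmil)

def quarter_to_date(toa_quarters):
    """Convert a quarter count to (year, quarter); 0 maps to (1942, 3)."""
    q = math.floor(toa_quarters) + 2  # shift so that 0 would be 1942 Q1
    return 1942 + q // 4, q % 4 + 1

def delta_toa(calculated, actual):
    """Calculated TOA minus actual TOA, in quarters.

    A positive value means the engine passed its MQT earlier than the
    trend would predict for its characteristics.
    """
    return calculated - actual
```

The signs of the coefficients carry the interpretation: a higher thrust to weight ratio or turbine inlet temperature moves the calculated date later, and a lower specific fuel consumption does the same.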

The  equation  contains  the  three  most  sought  after  characteristics 
in  the  turbine  engine  development  process,  and  the  sign  of  each 
coefficient  is  consistent  with  intuitive  notions  of  what  constitutes 
higher  technological  achievement.  The  thrust  to  weight  ratio  (THRWGT) 
and  turbine  inlet  temperature  (TEMP)  have  positive  coefficients, 
indicating  growth  over  time,  while  specific  fuel  consumption  (SFC)  is 
more  highly  valued  as  it  is  reduced.  A  graphical  representation  is 
plotted in Fig. 5. The 45-degree line represents the average trend or
expected date of MQT over the period. Points plotted above the
45-degree  line  represent  engines  "ahead  of  their  time";  that  is,  engines 
with  characteristics  yielding  TOAs  greater  than  their  actual  MQT  dates 
appear earlier than predicted. Likewise, points below the line are
"late" or "conservative" developments.

Fig. 5--TOA, preferred equation (axis shown: Actual TOA, quarters)

There are many possible reasons for deviations about the trend.
The equation therefore cannot be used for making fine distinctions, but
if  certain  points  or  trends  deviate  sharply  from  the  average,  it  should 
be  possible  to  distinguish  them.  For  example,  two  recent  engines,  the 
F100  and  F404,  deviate  from  the  trend  line  by  more  than  one  standard 
error. 

The  model  shows  the  F100  engine  to  be  ahead  of  its  time  by  almost 
14  quarters,  and  that  is  well  recognized  within  the  propulsion 
community.  However,  the  model  shows  the  F404  to  be  "late"  by  almost  20 
quarters. This tends to confirm the claim of the manufacturer and the
Navy that they deliberately "backed off" from the ultimate performance in
developing the F404, believing that a reduction of 5 to 10 percent from
the theoretically possible thrust to weight ratio and specific fuel
consumption would make possible a major reduction in complexity and
maintenance requirements.

The  F404  and  the  F100,  along  with  the  TF39,  which  was  the  first 
large  turbofan  engine,  are  the  most  influential  data  points  in  the  TOA 
equation  (see  Table  14). 

It  should  be  stressed  that  these  models  represent  the  time  of 
arrival  of  a  demonstrated  level  of  performance,  which  is  assumed  to 
represent  the  best  efforts  of  the  aircraft  turbine  engine  industry;  thus 
it  is  considered  to  be  the  technological  state  of  the  art  of  U.S. 
aircraft turbine engines. When an engine is predicted to be greatly
advanced, there is a good chance that its schedule will slip and costs
will increase. Whenever the difference between the estimated and
planned dates exceeds the standard error of the TOA equation (8.6
quarters), the development cost estimate should be adjusted. Such
revised estimates are best made in the context of the program by
estimators who have a sense of history and are familiar with program
details.
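
The screening rule just described can be sketched as a simple comparison. The function name and the wording of the returned labels are hypothetical; the only values taken from the text are the 8.6-quarter standard error and the rule that a gap larger than one standard error calls for a second look at the development cost estimate.

```python
TOA_STANDARD_ERROR = 8.6  # quarters, as quoted in the text

def toa_schedule_flag(calculated_toa, planned_toa, se=TOA_STANDARD_ERROR):
    """Compare the calculated TOA with the planned MQT date (both in
    quarters since the third quarter of 1942) and report whether the
    gap exceeds one standard error of the TOA equation."""
    gap = calculated_toa - planned_toa
    if gap > se:
        # Planned date is well ahead of the trend for this technology.
        return "ahead of trend: review development cost estimate"
    if gap < -se:
        # Planned date is well behind the trend for this technology.
        return "behind trend: review development cost estimate"
    return "within one standard error"
```

The size and direction of any adjustment are left to the estimator, as the text recommends; the sketch only identifies the cases that deserve attention.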

CAUTIONS 

There are several cautionary points regarding the input data.
First, they must reflect the maximum capability of the engine, not the
aircraft  in  which  it  is  to  be  installed.  Second,  the  engine  performance 
data  must  be  consistent  in  terms  of  the  thermodynamic  cycle.  Third, 
data  at  growth  points  require  some  sort  of  forecast,  a  fact  that  has  not 
been  analyzed  in  the  present  study.  The  problem  is  to  determine  an 
improved  technology  level  after  some  quantity  of  engines  has  been 
produced.  If  only  a  few  engines  are  produced,  technology  may  not  be 
uprated.  If  many  engines  are  produced,  technology  will  probably  be 
uprated.  Fourth,  predictions  can  be  made  with  greater  confidence  when 
the  parameter  values  for  a  new  engine  fall  within  the  range  of  the 
sample  data.  This  may  occur  for  some  new  developments;  however,  it  will 
probably  not  occur  for  at  least  the  technology  parameters.  As  a  guide 
to  the  estimator,  Table  15  shows  the  ranges  of  the  input  data  for  all 
equations  presented  in  this  report. 
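
The fourth caution can be applied mechanically once the sample ranges are at hand. The sketch below is a hypothetical helper, not part of the study: it takes the new engine's parameter values and a table of (minimum, maximum) pairs that the estimator would copy from Table 15, and reports which inputs fall outside the sampled range.

```python
def out_of_range_inputs(inputs, ranges):
    """Return the names of inputs that fall outside the sample range.

    inputs: dict mapping variable name -> value for the new engine.
    ranges: dict mapping variable name -> (minimum, maximum) observed
            in the sample (to be supplied by the estimator).
    """
    flagged = []
    for name, value in inputs.items():
        low, high = ranges[name]
        if not (low <= value <= high):
            flagged.append(name)
    return flagged
```

An empty result means every input lies within the range of the sample data; any flagged variable indicates that the equation is being extrapolated in that dimension.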


Table 15

MEANS AND RANGES OF INPUTS TO FINAL REGRESSION ANALYSIS


IV. OBSERVATIONS


The  statistics  for  the  estimating  relationships  described  in  the 
preceding  section  give  evidence  that  these  relationships  provide 
improvements  in  engine  cost  estimating  capability  over  the  DAPCA 
equations.  Major  strong  points  of  these  relationships  are  intuitive 
appeal,  ease  of  use,  fewer  independent  variables,  and  low  estimating 
error. For the three most recent engines, Table 16 shows percent
deviations from observed values. Also, we have insight into the
influence of one or a few engines in the data base on the derived
equations. The passing of time has seen additional engine development
and production, which have yielded useful data that have been added to
the data base so that it represents a wider range of engine
characteristics.
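
The Note does not state the formula behind the percent deviations in Table 16. A natural reading, consistent with the table's sign convention (negative values are underestimates, positive values are overestimates), is the calculated value minus the observed value, divided by the observed value. The sketch below assumes that reading; the function name is hypothetical.

```python
def percent_deviation(calculated, observed):
    """Signed percent deviation of a calculated value from the observed
    value: negative for an underestimate, positive for an overestimate."""
    if observed == 0:
        raise ValueError("observed value must be nonzero")
    return 100.0 * (calculated - observed) / observed
```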


Table 16

PERCENT DEVIATIONS OF OBSERVED VERSUS CALCULATED VALUES


Engine    MQTDEVCOST    TOTDEVCOST    PROCOST    TOA

F100         -11          -26^a          -6       10
F101         -12           --            -7        2
F404           6           --           +19      -14

Note: Negative values are underestimates; positive values are
overestimates.
a Through 1700 units.

The equation form selected represents a departure from the more
traditional  estimating  relationships.  Our  equations  are  linear  in  both 
the  independent  and  dependent  variables  (linear-linear  form),  whereas 
some  earlier  studies  used  logarithmic  transformations.  We  investigated 
four different equation forms: linear-linear, log-linear, linear-log,
and log-log; each equation form has its own implication for technology
and cost trends.[1] Forms other than linear-linear each had at least
one of the following drawbacks:

-  candidate  variables  were  not  significant 

-  coefficients  had  counterintuitive  signs 

-  CER  had  large  estimating  error 

-  independent  variables  exhibited  collinearity 

-  older  engines  exerted  considerable  influence 
on  predicted  values. 

Thus our initial criteria and analytical results caused us to select the
linear-linear format as being best suited for predicting future engine
costs.
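
To make the four candidate forms concrete, the sketch below writes each one out for a single explanatory variable. It is an illustration only: the Note does not define the naming order, so the example assumes that the first word refers to the dependent variable and the second to the independent variable. The function and its arguments are hypothetical.

```python
import math

def predict_form(form, a, b, x):
    """Predicted y for one explanatory variable x under each form.

    Naming assumption: "<dependent>-<independent>", for example
    "log-linear" means ln(y) is linear in x.
    """
    if form == "linear-linear":   # y = a + b*x
        return a + b * x
    if form == "log-linear":      # ln(y) = a + b*x
        return math.exp(a + b * x)
    if form == "linear-log":      # y = a + b*ln(x)
        return a + b * math.log(x)
    if form == "log-log":         # ln(y) = a + b*ln(x)
        return math.exp(a + b * math.log(x))
    raise ValueError("unknown form: " + form)
```

The forms differ in what they hold constant. In the linear-linear form a unit change in x always adds b to y, whereas in the log-log form a given percentage change in x produces a roughly constant percentage change in y.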

Stability  of  estimating  relationships  is  revealed  by  the  regression 
diagnostics. For all the preferred equations, highly influential data
points were judged to have characteristics representative of future
engines. They were therefore included when the final estimating
relationships  were  developed  because  they  are  valid  observations 
providing  useful  information  about  the  nature  of  engine  costs  and 
technology. 

[1] W. L. Stanley and M. Miller, Measuring Technological Change in
Jet Fighter Aircraft, The Rand Corporation, R-2249-AF, September 1979,
pp. 16-18.

The diagnostic statistics provide information useful in considering
the effects of these influential observations upon estimated costs.
Every observation with strong influence on the fit also has strong
influence  on  the  coefficients  of  one  or  more  explanatory  variables.  The 
contributions  of  these  variables  to  the  cost  or  TOA  estimate  are 
especially  sensitive  to  the  influential  observation.  The  cost  estimator 
can  tell  whether  a  variable  will  be  a  problem  in  his  planned  application 
of  the  estimating  relationship,  depending  upon  whether  the  engine  whose 
cost  is  to  be  estimated  most  closely  resembles  the  influential  engine  or 
the  other  observations. 

NOTEWORTHY  ENGINES  AND  VARIABLES 

Table 17 is a summary of the engines that deserve special note.
The F100 strongly influences the coefficients in three models. The TF39
and F404 are also influential data points; however, different models are
affected by individual engines to different degrees, and other engines
stand out as atypical for the different models. It should be recalled
that not all engines are in the samples for all models, and this may
account for some of the variation from one model to another.


Table 17

NOTEWORTHY ENGINES


Dependent      Statistically
Variable       Influential Engines

MQTDEVCOST     F100, J58
TOTDEVCOST     F100, J79, J57
PROCOST        F404, TF39, J75
TOA            F100, F404, TF39

Table  18  lists  the  explanatory  variables  that  showed  up  most  often 
in  models  with  good  statistical  properties.  THRMAX,  MACH  and  TEMP  were 
important  for  all  the  models.  TOA  itself  is  not  in  any  of  the  preferred 
relationships.  The  time  of  arrival  is  thus  less  important  now  as  a 
driver  of  engine  costs  than  it  was  in  the  past,  probably  because  of  the 
expanded  data  base  used  for  this  study. 

COMPARISON  WITH  DAPCA 

Making  a  fair  comparison  with  the  earlier  DAPCA  model  is  not  an 
easy task. We have the benefit of more information and improved
statistical methods, neither of which was available when the original
DAPCA equations were derived. Nevertheless, a comparison is needed to provide
a  measure  of  progress. 


Table 18

MOST FREQUENTLY OBSERVED EXPLANATORY VARIABLES


Dependent Variable    Explanatory Variable

MQTDEVCOST            MACH, TEMP, THRMAX, THRWGT, WGT
TOTDEVCOST            MACH, QTY, TEMP, THRMAX, THRWGT
PROCOST               MACH, TEMP, THRMAX, THRWGT
TOA                   SFCMIL, TEMP, THRWGT, THRMAX, TOTPRS

Comparing R^2 and F values of the derived equations and the DAPCA
model  is  not  useful  because  of  different  expressions  of  the  dependent 
variable  and  differing  numbers  of  independent  variables.  Therefore,  we 
use  historical  simulation  to  compare  our  model  with  DAPCA.  Historical 
simulation  is  based  on  the  idea  that  if  the  CERs  have  merit  now,  the 
same  relationships,  derived  from  the  then  available  data  base,  should 
have  worked  in  the  past.  The  historical  simulation  technique  is 
implemented  by  removing  the  most  recent  engines  from  our  data  base  and 
then  performing  a  regression  analysis  on  the  reduced  sample.  The 
regression  coefficients  obtained  are  then  used  to  predict  the  costs  and 
TOA for the omitted engines.

This  technique  was  used  for  our  equation  to  predict  costs  and  TOA 
for  the  F100,  F101,  and  F404.  Because  the  DAPCA  data  base  did  not 
include  these  engines,  predictions  were  also  obtained  using  the  DAPCA 
equations  directly. 
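
The historical simulation procedure can be sketched in a few lines. This is a generic illustration under stated assumptions, not the study's own software: the function names are hypothetical, the fit is an ordinary least-squares solution of the normal equations, and each engine is represented as a tuple of (date key, explanatory values, observed value). The most recent engines are held out, the equation is refitted on the remainder, and the refitted coefficients are used to predict the held-out engines.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(xs, ys):
    """Least-squares coefficients [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)

def predict_linear(coef, x):
    """Evaluate a linear-linear equation at the explanatory values x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def historical_simulation(engines, n_holdout):
    """engines: list of (date_key, x_values, observed).

    Refit on all but the n_holdout most recent engines, then return
    (date_key, predicted, observed) for each held-out engine.
    """
    ordered = sorted(engines, key=lambda e: e[0])
    train, held_out = ordered[:-n_holdout], ordered[-n_holdout:]
    coef = fit_ols([e[1] for e in train], [e[2] for e in train])
    return [(e[0], predict_linear(coef, e[1]), e[2]) for e in held_out]
```

The held-out predictions are genuine forecasts in the sense that the omitted engines played no part in estimating the coefficients, which is what makes the comparison with DAPCA on the F100, F101, and F404 meaningful.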

Figure  6  illustrates  the  results  of  this  comparison.  The  results 
confirm  that  DAPCA  generally  underestimates  costs  for  new  engines;  the 
exception,  of  course,  is  the  F404  for  which  DAPCA  grossly  overestimated 
development  costs.  The  new  model  also  tends  to  underestimate  costs  for 
these  engines  but  does  not  experience  the  wide  variation  in  estimating 
error  that  DAPCA  exhibits,  particularly  for  the  development  and 
production  cost  estimates.  Unfortunately,  total  development  cost  data 
were  unavailable  for  the  F404  and  F101,  so  a  comparison  was  made  for  the 
total  development  costs  of  the  F100  through  1700  engines.  In  this  case, 
DAPCA predicts better than the new equation, even though it would not
have satisfied our model selection criteria. Essentially both models
predict TOA equally well. In summary, then, historical simulation
reveals that the new model does as well as DAPCA or better.

The new model uses fewer explanatory variables and meets more
stringent statistical criteria than DAPCA. For the most important cost
categories--costs  through  MQT  and  production--the  new  model  predicts 
with  less  error. 

IMPLICATIONS  OF  TOA 

The  TOA  methodology  measures  the  technology  trend  not  directly,  but 
as  a  time  trend.  It  assumes  that  engine  technology  advances  through 
steady,  continuous  growth.  Growth  in  the  1980s  is  assumed  to  be  equal 
to  the  growth  rate  experienced  during  the  1960s  or  1970s.  Various 
engine  research  and  exploratory  development  programs  of  the  military  and 
the  manufacturers  are  taken  to  continue  at  a  constant  rate.  Clearly,  if 
the  degree  of  support  for  such  work  changes,  or  its  effectiveness 
changes  (in  either  direction),  the  estimating  relationships  based  on 
past trends will lose some validity. Changes in acquisition
procedures that may influence decisions on when to apply new technology
will also affect the usefulness of these relationships. In general,
meaningful  use  of  TOA  to  predict  risks  must  allow  for  the  influence  of 
future  changes  in  technology  development  and  application. 

LIMITATIONS 

The  results  described  in  this  study  are  intended  for  estimating  the 
cost of large, modern future aircraft engines in the context of
long-range planning studies. Any new engine to be estimated must be
consistent with the basic assumptions under which the CERs were derived.
Specifically, the CERs apply to development and pricing practices
similar to those of the 1960s and 1970s and assume basic gas turbine
design will be similar to that of the engines of today. Obvious
differences  are  evident,  but  no  fundamental  change  is  foreseen  without 
stepping  outside  the  definition  of  an  aircraft  turbine  engine.  Apart 
from  the  usual  incremental  increases  in  component  efficiency  obtained  by 
newer  alloys  and  novel  component  designs,  it  would  appear  that  the 
estimating  relationships  previously  discussed  can  be  used  for  most  of 
the  proposed  aircraft  designs. 

Certain  designs  incorporate  features  that,  although  radical  in 
appearance,  still  permit  the  estimating  relationships  to  be  used  as  a 
base  for  subsequent  adjustment.  Swivelling  nozzles,  for  example,  are 
appendages  that  supplement  the  basic  engine. 

The lift engine is one application where difficulties will arise.
Because it is considerably different in usage and design, it is doubtful
whether the estimating relationships derived in this study will apply.
Another case would be where the material content of the engine was
radically different from that of the engines making up the sample--for
example, an engine having cold or hot section parts largely fabricated
from composite or ceramic materials. Fortunately for the estimator, the
engine development process has been an evolutionary one. If this trend
continues, the estimating relationships derived in this study will apply
to future U.S. developed and produced aircraft turbine engines.

FUTURE WORK

Data  Base  Expansion 

Further  data  base  expansion  seems  to  be  the  key  to  further 
improvements  in  engine  cost  estimating  relationships.  Regression 
diagnostics  have  consistently  identified  the  most  recently  developed 
engines  as  having  unusually  strong  influence  on  the  CERs.  Therefore, 
the  CERs  should  be  updated  as  data  for  new  engines  become  available. 
This,  of  course,  is  restricted  by  the  limited  number  of  engines  in  work 
at  any  given  time,  so  progress  will  be  slow.  Most  military  turbine 
engines  likely  to  be  developed  in  the  near  future  will  be  turbofans,  if 
recent  trends  continue.  Turbofan  engines  are  outnumbered  now  in  the 
data  base  by  turbojets.  As  more  turbofans  are  added  to  the  data  base, 
the  total  sample  will  become  more  representative  of  that  type  of  engine, 
permitting  development  of  estimating  relationships  even  more  useful  than 
those obtained in this study.

Parameter  Interrelationships 

Further  investigation  of  potential  explanatory  variables  may  also 
be  profitable.  Those  selected  and  our  criteria  greatly  affected  the 
structure  of  the  derived  models.  It  is  unlikely  that  everything  has 
been  done  that  can  be  done  in  this  area.  One  avenue  that  may  be 
fruitful  is  the  study  of  the  interrelationships  of  the  explanatory 
variables.  Thrust,  Mach,  and  temperature  are  one  important  set. 
Understanding  their  relationships  to  each  other  more  accurately  should 
help to point the way to new approaches to studying their influence on
cost.

These areas for further improvement of engine cost estimating are
really  extensions  of  the  work  done  in  this  study.  Just  as  improvements 
were  achieved  during  this  present  effort,  so  should  improvements  result 
from  future  work,  as  the  availability  of  new  data  will  allow. 

Derivative  Engines 

The  CERs  developed  assume  that  each  new  engine  starts  from  the 
drawing  board  and  follows  the  "normal"  course  to  MQT  and  subsequent 
production.  They  do  not  lend  themselves  to  estimating  the  incremental 
development  costs  resulting  from  a  growth  or  a  derivative  version  of  an 
earlier,  similar  engine.  But  as  budget  constraints  and  high  development 
costs  combine  to  limit  the  development  of  all  new  military  engines, 
strong  efforts  have  been  made  to  encourage  and  capitalize  on  this  type 
of  evolutionary  development. [2]  Given  that  growth  and  derivative  engines 
are not only a phenomenon of the past, but perhaps the wave of the
future,  estimating  tools  should  be  developed  that  address  the  costs 
associated  with  evolutionary  growth  of  a  common  engine  family. 

[2] The Engine Model Derivative Program (EMDP), managed by the ASD
Deputy  for  Propulsion,  is  an  example  of  this  philosophy.  This  program 
is  designed  to  improve  performance  and  durability  at  lower  costs.  It 
exploits  existing  technology  and  applies  it  to  a  current  engine  model 
to  evolve  a  newer  model.  Under  EMDP,  the  resulting  newer  model  must 
have  better  characteristics  than  the  existing  engine. 


REGRESSION  EQUATIONS  AND  STATISTICS 


Table A.1

REGRESSION DIAGNOSTICS FOR MQTDEVCOST EQUATION


                      SUM OF         MEAN
SOURCE      DF       SQUARES       SQUARE    F VALUE    PROB>F

MODEL        3       1173246       391082     54.562    0.0001
ERROR       12     86011.897     7167.658
C TOTAL     15       1259258

ROOT MSE    84.662022     R-SQUARE    0.9317
DEP MEAN      433.438     ADJ R-SQ    0.9146
C.V.         19.53269


                  PARAMETER       STANDARD     T FOR H0:
VARIABLE    DF     ESTIMATE          ERROR   PARAMETER=0    PROB > ABS(T)

INTERCEP     1     -845.804        154.190        -5.485           0.0001
THRMAX       1  0.005338497     0.00272854         1.957           0.0741
MACH         1      249.838      38.302369         6.523           0.0001
TEMP         1     0.312816       0.075867         4.123           0.0014


COLLINEARITY DIAGNOSTICS                  VARIANCE PROPORTIONS

                       CONDITION     PORTION    PORTION    PORTION    PORTION
NUMBER    EIGENVALUE       INDEX    INTERCEP     THRMAX       MACH       TEMP

1              3.746       1.000      0.0013     0.0092     0.0060     0.0009
2           0.181933       4.537      0.0181     0.6128     0.0216     0.0026
3           0.064423       7.625      0.0523     0.0131     0.9678     0.0230
4           0.007946      21.711      0.9283     0.3649     0.0047     0.9734


Table A.3

REGRESSION DIAGNOSTICS FOR TOTDEVCOST EQUATION


                      SUM OF         MEAN
SOURCE      DF       SQUARES       SQUARE    F VALUE    PROB>F

MODEL        3       3574504      1191501     30.998    0.0001
ERROR       25        960957    38438.269
C TOTAL     28       4535461

ROOT MSE      196.057     R-SQUARE    0.7881
DEP MEAN      628.207     ADJ R-SQ    0.7627
C.V.         31.20895
DURBIN-WATSON   1.896


                  PARAMETER       STANDARD     T FOR H0:
VARIABLE    DF     ESTIMATE          ERROR   PARAMETER=0    PROB > ABS(T)

INTERCEP     1     -525.763        132.433        -3.970           0.0005
THRMAX       1     0.022730     0.00557228         4.079           0.0004
MACH         1      401.022      77.641102         5.165           0.0001
QTY          1     0.070436       0.020297         3.470           0.0019


COLLINEARITY DIAGNOSTICS                  VARIANCE PROPORTIONS

                       CONDITION     PORTION    PORTION    PORTION    PORTION
NUMBER    EIGENVALUE       INDEX    INTERCEP     THRMAX       MACH        QTY

1              3.312       1.000      0.0064     0.0122     0.0055     0.0311
2           0.542866       2.470      0.0057     0.0185     0.0070     0.9553
3           0.106281       5.582      0.1844     0.9095     0.0562     0.0070
4           0.039282       9.182      0.8035     0.0599     0.9313     0.0066


Table A.5

REGRESSION DIAGNOSTICS FOR PROCOST EQUATION


                      SUM OF         MEAN
SOURCE      DF       SQUARES       SQUARE    F VALUE    PROB>F

MODEL        3      13743738      4581246    135.855    0.0001
ERROR       18        606990    33721.690
C TOTAL     21      14350728

ROOT MSE      183.635     R-SQUARE    0.9577
DEP MEAN      914.000     ADJ R-SQ    0.9507
C.V.         20.09132


                  PARAMETER       STANDARD     T FOR H0:
VARIABLE    DF     ESTIMATE          ERROR   PARAMETER=0    PROB > ABS(T)

INTERCEP     1    -2228.140        309.448        -7.200           0.0001
THRMAX       1     0.043007    0.005876019         7.319           0.0001
MACH         1      243.250      81.149876         2.998           0.0077
TEMP         1     0.968842       0.157582         6.148           0.0001


COLLINEARITY DIAGNOSTICS                  VARIANCE PROPORTIONS

                       CONDITION     PORTION    PORTION    PORTION    PORTION
NUMBER    EIGENVALUE       INDEX    INTERCEP     THRMAX       MACH       TEMP

1              3.684       1.000      0.0011     0.0102     0.0064     0.0008
2           0.243189       3.892      0.0109     0.5277     0.0160     0.0017
3           0.066085       7.466      0.0429     0.0138     0.9728     0.0201
4           0.006785      23.301      0.9451     0.4483     0.0048     0.9773


Table A.7

REGRESSION DIAGNOSTICS FOR TOA EQUATION


                      SUM OF         MEAN
SOURCE      DF       SQUARES       SQUARE    F VALUE    PROB>F

MODEL        3     39482.257    13160.752    179.243    0.0001
ERROR       25      1835.605    73.424220
C TOTAL     28     41317.862

ROOT MSE     8.568793     R-SQUARE    0.9556
DEP MEAN    62.931034     ADJ R-SQ    0.9502
C.V.         13.61616


                  PARAMETER       STANDARD     T FOR H0:
VARIABLE    DF     ESTIMATE          ERROR   PARAMETER=0    PROB > ABS(T)

INTERCEP     1   -46.917999      18.003362        -2.606           0.0152
THRWGT       1     8.727609       1.403500         6.218           0.0001
TEMP         1     0.046289     0.00730724         6.335           0.0001
SFCMIL       1   -31.263928       8.037125        -3.890           0.0007


COLLINEARITY DIAGNOSTICS                  VARIANCE PROPORTIONS

                       CONDITION     PORTION    PORTION    PORTION    PORTION
NUMBER    EIGENVALUE       INDEX    INTERCEP     THRWGT       TEMP     SFCMIL

1              3.772       1.000      0.0005     0.0041     0.0006     0.0028
2           0.198962       4.354      0.0012     0.1700     0.0008     0.0971
3           0.024266      12.468      0.0569     0.6399     0.1164     0.5144
4           0.004405      29.264      0.9414     0.1860     0.8822     0.3856
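
The summary statistics in these tables are linked by standard regression identities, which gives a quick way to check a transcribed table. The sketch below is a hypothetical helper that recomputes F, R-square, adjusted R-square, root MSE, C.V., and the condition indexes from the sums of squares, degrees of freedom, dependent mean, and eigenvalues printed in Table A.7. It assumes the usual definitions (for example, condition index as the square root of the largest eigenvalue divided by each eigenvalue), which the printed values appear to follow.

```python
import math

def regression_summary(ss_model, df_model, ss_error, df_error, dep_mean):
    """Recompute the summary statistics from the analysis-of-variance rows."""
    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    r_square = ss_model / ss_total
    root_mse = math.sqrt(ms_error)
    return {
        "F": ms_model / ms_error,
        "R_SQUARE": r_square,
        "ADJ_R_SQ": 1.0 - (1.0 - r_square) * df_total / df_error,
        "ROOT_MSE": root_mse,
        "CV": 100.0 * root_mse / dep_mean,
    }

def condition_indexes(eigenvalues):
    """Square root of (largest eigenvalue / each eigenvalue)."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / ev) for ev in eigenvalues]

# Values as printed in Table A.7 (TOA equation).
toa_stats = regression_summary(39482.257, 3, 1835.605, 25, 62.931034)
toa_condition = condition_indexes([3.772, 0.198962, 0.024266, 0.004405])
```

Small differences in the last digit are to be expected because the printed inputs are themselves rounded. The root MSE recomputed here is the 8.6-quarter standard error quoted for the TOA equation in the main text.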
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